Novel chromosome 21 gene marker, compositions and methods using same

ABSTRACT

The present invention provides isolated nucleic acids encoding human EHOC-1 protein and isolated receptor proteins encoded thereby. Further provided are vectors containing invention nucleic acids, probes that hybridize thereto, host cells transformed therewith, antisense oligonucleotides thereto and compositions containing same, antibodies that specifically bind to invention polypeptides and compositions containing same, as well as transgenic non-human mammals that express the invention protein.

ACKNOWLEDGEMENT

[0001] This invention was made in part with Government support under Grant No. HD17449-11, from the National Institute of Child Health and Human Development. The Government may have certain rights in this invention.

BACKGROUND OF THE INVENTION

[0002] A major endeavor in molecular genetics has been the generation of maps of the human genome. Human genome mapping consists, generally, of ordering genomic DNA fragments on their chromosomes using several methods, such as fluorescent in situ hybridization (FISH), somatic cell hybrid analysis or random clone fingerprinting. DNA fragments that correspond to marked polymorphic sites can be ordered by genetic linkage analysis. Distances between polymorphic loci are estimated by meiotic recombination frequencies. High resolution maps based upon the estimated distances, however, cannot be constructed easily using such methods because the resolution is low at the molecular level and recombination frequency is not linearly correlated with physical distance.

[0003] Thus, various obstacles such as, for example, the difficulty in obtaining highly informative markers and the paucity of identified markers that are evenly spaced along the chromosome are significant weaknesses of the currently available genetic maps. Most of the mapped markers are restriction fragment length polymorphisms (RFLPs) assayed by DNA hybridization. Although maps based on these markers have contributed greatly to the primary mapping of a number of diseases, they are still insufficient for many applications such as mapping rare monofactorial diseases, refining linkage intervals to distances suited for gene identification, and mapping of loci contributing to complex traits.

[0004] Genetic linkage mapping is an important technology applied to the study of human biology and, in particular, for the delineation of the molecular basis of disease. Indeed, one of the most commonly used strategies for studying human inherited diseases is by cloning the responsible gene based on chromosomal location. Genetic linkage maps, therefore, facilitate the identification and mapping of genes involved in monogenic diseases, genes involved in multifactorial disorders, and are useful in carrier detection and prenatal diagnosis of hereditary disorders. A detailed linkage map is also a prerequisite for clone-based physical mapping and DNA sequencing of the entire chromosome.

[0005] Human chromosome 21 is a paradigm for large-scale human genome mapping efforts. The smallest human chromosome, chromosome 21 has approximately 50 megabases (Mb) of DNA. Less than 1% of the 2000 genes estimated to be on chromosome 21 are known. A high resolution map of chromosome 21 is of particular interest because of its apparent role in familial Alzheimer disease (FAD), Down's syndrome (DS), amyotrophic lateral sclerosis (ALS), and Finnish progressive myoclonus epilepsy (PME). A gene defect responsible for FAD has been localized to chromosome 21 on the basis of genetic linkage to three pericentromeric loci. The gene encoding the precursor of the Alzheimer-associated amyloid β protein (APP), the principal component of the senile plaques and cerebrovascular amyloid deposits of Alzheimer disease (AD), has also been mapped to chromosome 21.

[0006] The process of developing such a long-range contig map involves the identification and localization of landmarks in cloned genetic fragments. When there are enough landmarks for the size of the cloned fragments, contigs are formed, and the landmarks are simultaneously ordered. Currently, YACs, or yeast artificial chromosomes, are utilized for most mapping of the human genome. YACs permit cloning of fragments of ≧ about 500 Kb. However, some difficulties have been encountered with the manipulation of YAC libraries. For example, in various YAC libraries, a fraction of the clones result from co-cloning events, i.e., they include in a single clone noncontiguous DNA fragments. A high percentage of YAC clones, particularly clones having high molecular weight inserts, are chimeric. Chimeric clones map to multiple sites on the chromosome and, thus, hamper the progress of mapping and analysis. Another problem endemic to YAC cloning is caused by DNA segments that are unclonable or unstable and tend to rearrange and delete.

[0007] Bacterial Artificial Chromosomes (BACs) provide an alternative to the YAC system. BACs mitigate the most problematic aspects of YACs such as, for example, the high rate of chimerism and clonal instability. BACs are based on the E. coli single-copy plasmid F factor and are capable of faithful propagation of DNA fragments greater than about 300 Kb in size. BACs have a number of physical properties that make them amenable to physical mapping, including easy manipulation and an absence of chimerism. The lack of chimerism and the capacity to propagate large exogenous insert DNAs make the BACs excellent candidates for chromosome walking and the generation of contiguous physical maps.

[0008] The need for a molecular description of chromosome 21 derives directly from its association with several human genetic diseases. A map of contiguous units (contigs) covering this chromosome will speed the identification of the cause of these diseases. Indeed, a detailed map would provide immediate access to the genomic segment, including any pathological locus, as soon as it has been localized by genetic linkage or cytogenetic analysis.

[0009] Thus, a need exists for identifying, characterizing, and mapping the numerous genes that occupy loci on chromosome 21, which will expedite the rapid translation of high resolution chromosome maps into biological, medical and diagnostic applications. The present invention satisfies this need and provides related advantages as well.

SUMMARY OF THE INVENTION

[0010] The present invention provides isolated nucleic acids encoding human EHOC-1 protein and isolated receptor proteins encoded thereby. Further provided are vectors containing invention nucleic acids, probes that hybridize thereto, host cells transformed therewith, antisense oligonucleotides thereto and compositions containing same, antibodies that specifically bind to invention polypeptides and compositions containing same, as well as transgenic non-human mammals that express the invention protein.

BRIEF DESCRIPTION OF THE FIGURES

[0011] FIG. 1 shows a physical map for the consensus region for HPE1.

[0012] FIG. 2 shows a physical map for the consensus region for EPM1 in relation to the consensus region for HPE1. The locations of YAC clones, BAC clones and EHOC-1 are indicated by thick bars.

[0013] FIG. 3A shows the regions in which EHOC-1 has homologies to transmembrane proteins. Region 1 represents 29.4% identity in a 34 amino acid overlap with rat sodium channel protein III. Region 2 represents 20.4% identity in a 103 amino acid overlap with phosphoglycerate transport system regulatory protein of Salmonella typhimurium. Region 3 represents 29.1% identity in a 55 amino acid overlap with pyrophosphate-energized vacuolar membrane proton pump of Arabidopsis thaliana. Region 4 represents 24.0% identity in a 50 amino acid overlap with myosin-like protein of Saccharomyces cerevisiae. Region 5 represents 17.9% identity in a 39 amino acid overlap with rabbit cardiac muscle ryanodine receptor. Region 6 represents 21.0% identity in a 62 amino acid overlap with rat cardiac muscle sodium channel protein alpha subunit. Region 7 represents 40.7% identity in a 27 amino acid overlap with rat skeletal muscle sodium channel protein alpha subunit. Region 8 represents 30.3% identity in a 33 amino acid overlap with dystrophin cysteine-rich domain.

[0014] FIG. 3B shows a comparison of genetic distances to the EPM1 locus in centiMorgans as computed by linkage disequilibrium studies (Lehesjoki et al., Hum. Mol. Genet. 2:1229-1234 (1993)).

DETAILED DESCRIPTION OF THE INVENTION

[0015] Progressive myoclonus epilepsies (PMEs) are a heterogeneous group of diseases characterized by myoclonus, epileptic seizures and progressive neurological deterioration, including ataxia and dementia (Berkovic et al., New Engl. J. Med. 315:296-305 (1986)). PME of Unverricht-Lundborg type (EPM1) is an autosomal recessive disorder with frequent consanguinity in Finland and Mediterranean regions, with an incidence of at least 1:20,000 in Finland. Genetic linkage analysis revealed that the locus for EPM1 is on chromosome 21q22.3 (Malafosse et al., Lancet 339:1080-1081 (1992)) and excluded from this region Lafora disease, which is also a member of the PME group (Lehesjoki et al., Neurology 42:1545-1550 (1992)). Linkage disequilibrium analysis made it possible to narrow down the candidate region to 300 kb spanning the loci of PFKL, D21S25 and D21S154 (Lehesjoki et al., Hum. Mol. Genet. 2:1229-1234 (1993); Lehesjoki et al., Human Genetics 93:668-674 (1994)).

[0016] Autoimmune polyglandular disease type 1 (APECED) was also mapped to chromosome 21q22.3 by linkage disequilibrium analysis (Aaltonen, J., et al., Nature Genet. 8:83-87 (1994)). APECED is an autosomal recessive disease resulting in a variable combination of failure of the parathyroid glands, adrenal cortex, gonads, pancreatic β cells, thyroid gland and gastric parietal cells. Additional effects of APECED include alopecia, vitiligo, hepatitis, chronic mucocutaneous candidiasis, dystrophy of the dental enamel and nails, and keratopathy. APECED usually manifests itself in childhood, but tissue-specific symptoms may appear throughout adulthood. The APECED locus maps within 500 kb of D21S49 and D21S171.

[0017] Holoprosencephaly is characterized by impaired cleavage of the embryonic forebrain and incomplete mid-facial development that manifest as a wide range of midfacial anomalies including cyclopia, ethmocephaly, cebocephaly, premaxillary agenesis, hypotelorism, and a single maxillary central incisor. The most commonly associated chromosomal abnormalities include dup(3p), del(7q), deletions of chromosome 13, trisomy 13, trisomy 18, and triploidy (Munke, Am J Med Genet 34:237-245 (1989)). The etiology is heterogeneous and may include aneuploidies for chromosomes 2, 3, 7, 13, 18 and 21. In order to narrow down the candidate region for HPE1, the deletion of 21(q22.3) was characterized in two HP patients by fluorescence in situ hybridization and quantitative Southern blot dosage analysis. For the smaller deletion, the regions for D21S25, D21S154, D21S171 and D21S44 were deleted and those for D21S42 and D21S49 were not. Combining these data with previous reports of deletion of 21q22.3 (D21S112-ter) without the holoprosencephaly phenotype indicates that the region responsible for holoprosencephaly spans the 1-2 Mb region including PFKL and ITGB2 (CD18). Four cases of holoprosencephaly with chromosome 21 anomalies have been published. Estabrooks et al. describe a minute deletion of chromosome 21(q22.3) (Estabrooks et al., Am J Med Genet 36:306-309 (1990)), suggesting this region as a locus for holoprosencephaly (HPE1).

[0018] Described in the instant specification is the construction of the BAC (Bacterial Artificial Chromosome) (Shizuya et al., Proc. Natl. Acad. Sci. USA 89:8794-8797 (1992)) contig of this EPM1-APECED-HPE1 candidate region and the isolation of a novel gene from this contiguous map unit using a direct cDNA selection technique.

[0019] In order to isolate genes responsible for these diseases, a cDNA library from a 14-week trisomy 21 fetal brain was constructed using Uni-Zap XR (Stratagene, La Jolla, Calif.). More than 950 clones have inserts ranging from 1-4 kb (avg. 2 kb). In addition, a direct cDNA selection method was applied to BACs (Bacterial Artificial Chromosomes) in the 21q22.3 region. Using cDNA synthesized from trisomy 21 fetal brain, Sau3AI linkers were attached, the cDNA then was digested with Sau3AI, followed by attachment of a second pair of linkers, and hybridized to biotinylated BAC DNAs which cover the candidate region. cDNA/BAC DNA hybrid molecules were captured on streptavidin-coated magnetic beads, non-specific cDNAs were washed out, and specifically hybridized cDNAs were eluted and amplified by PCR. Twice-selected PCR products were subcloned and analyzed. Southern blot analysis revealed that 21 out of 30 (70%) of fragments yielded unique bands of the original BACs. Using these fragments as probes, cDNAs (3 kb, 4 kb and 5 kb) were isolated from the library. The 5 kb cDNA subclone (EHOC-1) maps proximal to but neighboring D21S25 and showed homologies to transmembrane genes. The loci of these genes all map within the consensus region where holoprosencephaly, EPM1 and APECED are localized.

[0020] DNA sequence analysis of the 5 kb cDNA showed a complete coding sequence of 3570 bp, which was found to have homologies at the amino acid sequence level to transmembrane proteins, including three kinds of sodium channel proteins (SEQ ID NOS: 1-2; FIG. 3).

[0021] Five types of BAC clone were isolated from the total human genomic DNA BAC library (Shizuya et al., Proc. Natl. Acad. Sci. USA 89:8794-8797 (1992)) by a PCR screening method using STSs containing PFKL, D21S25, D21S154 and CD18. Physical maps of the HPE1-EPM1-APECED consensus region with these BAC clones and YAC clones (Chumakov et al., Nature 359:380-387 (1992)) are shown in FIGS. 1 and 2. BAC-1 (230 kb) and BAC-2 (210 kb) were positive for D21S25. BAC-3 (170 kb) was positive for D21S25 and PFKL. Agarose gel electrophoresis of EcoRI-digested BAC DNAs and Southern blot analysis showed that these 3 BACs overlap each other. BAC-4 was identical to BAC-3. BAC-5 (100 kb) was positive for CD18.

[0022] Direct cDNA selection was performed on 5 BAC DNAs (four of which were overlapping) which span the consensus region. EcoRI digestion of subclone DNAs revealed that 10% of clones were chimeric. The average size of the inserts of non-chimeric clones was 400 bp. Forty non-chimeric subclones of selected cDNAs were analyzed by using EcoRI-digested BAC DNA Southern blots. Twenty-eight clones (70%) showed unique signals on the BAC blots, 6 clones (15%) showed repetitive signals, and 6 clones (15%) did not show any signal on these blots. Using insert DNAs of these subclones as probes, a trisomy 21 fetal brain cDNA library was screened. Three overlapping cDNAs (3 kb, 4 kb and 5 kb) containing a poly(A) tail were isolated and designated EHOC-1.

[0023] The three overlapping EHOC-1 cDNA subclones were used for Southern blot analysis using EcoRI-digested BAC DNA blots. Only BAC-1 showed unique multiple band signals, indicating that these cDNAs originated from BAC-1. Identical sizes of the signal bands indicated that these clones overlap each other. The complete sequence of the EHOC-1 5 kb cDNA clone and partial sequence analysis of the 3 kb and 4 kb clones showed that the entire sequence of the 3 kb clone and part of the sequence of the 4 kb clone are contained in the 5 kb clone, but the 3′ end of the 4 kb clone was different from that of the 5 kb clone, indicating the existence of splice variants of EHOC-1 cDNAs. Northern blot analysis using the insert of the 5 kb EHOC-1 cDNA revealed three transcripts (5.3 kb, 7.5 kb and 8 kb) in multiple adult tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas). Fluorescence in situ hybridization was also done on lymphocytes of a normal individual using the insert of the EHOC-1 5 kb cDNA subclone. Discrete signals were seen on chromosome 21q22.3, confirming the locus. The complete sequence of the 5 kb cDNA clone revealed an open reading frame of 3570 bp (1190 amino acids). The initiator ATG was located within a good Kozak consensus sequence (Kozak, M., J. Mol. Biol. 196:947-950 (1987); Kozak, M., Nuc. Acid Res. 15:8125-8148 (1987)). A homology search of the amino acid sequence of this ORF against genes registered in Genbank/EMBL showed that this gene product has homologies to multiple transmembrane proteins including three types of sodium channel proteins (FIG. 3).
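The relationship between an open reading frame and its encoded length can be illustrated with a minimal sketch. The function name `longest_orf` and the toy sequence are hypothetical and are not part of the specification; the sketch assumes the standard genetic code, an upper- or lower-case sequence containing only A, C, G and T, and a search of the forward strand only. Note that 3570 bp corresponds to exactly 1190 codons; the specification does not state whether the stop codon is counted within the 3570 bp.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def longest_orf(dna):
    """Return (start, end, protein) for the longest ATG-to-stop ORF.

    start and end are 0-based, end-exclusive, and end includes the stop codon.
    Only the three forward reading frames are examined.
    """
    dna = dna.upper()
    best = (0, 0, "")
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        protein = []
        for pos in range(start, len(dna) - 2, 3):
            residue = CODON_TABLE[dna[pos:pos + 3]]
            if residue == "*":
                # A stop codon closes the frame; keep it if it is the longest so far.
                if len(protein) > len(best[2]):
                    best = (start, pos + 3, "".join(protein))
                break
            protein.append(residue)
    return best


# Hypothetical toy sequence, not taken from SEQ ID NO: 1.
start, end, protein = longest_orf("CCATGGCTAAATGACC")
print(start, end, protein)
```

Applied to a full-length cDNA, the same scan would report the coordinates of the longest forward-strand reading frame, which can then be compared with the reported ORF length.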

[0024] Some neurologic disorders in humans are known to result from mutations in sodium channels (Ptacek et al., Cell 67:1021-1027 (1991); Rojas et al., Nature 354:387-389 (1991); McClatchey et al., Cell 68:769-774 (1992); Ptacek et al., Neuron 8:891-897 (1992)), calcium channels (Ptacek et al., Cell 77:863-868 (1994); Jurkat-Rott et al., Hum. Mol. Genet. 3:1415-1419 (1994)), and a potassium channel (Browne et al., Nature Genet. 3:136-140 (1994)). Using the BLAST computer program (Altschul, S.J., et al., J. Mol. Biol. 215:403-410 (1994)), one fibronectin domain (CxV . . . YxC) was found at amino acids 356-401. The analysis also showed that the motif (Sxxx(I,L)E) occurs at positions 462, 670, 708, 716, 730, and 1078. Various protein databases were searched for this motif, and there were very few proteins in which it was present three or more times. These include rat cartilage-specific proteoglycan core protein, myosin, Drosophila sevenless (4 copies), Drosophila prospero (4 copies), and Drosophila serendipity (3 copies). The latter three are Drosophila developmental mutants: sevenless causes an eye defect, prospero causes defects in axon pathfinding, and serendipity causes defects in cellularization. It is reasonable that a defect in axonal routing may correlate with the phenotype of EHOC-1. The region beginning at 777 also has some homologies to multiple drug resistance genes and to the Drosophila rutabaga gene. Rutabaga is involved in learning in Drosophila.
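A motif scan of the kind described above can be sketched with a regular expression. This is an illustrative sketch only: the function name `motif_positions` and the toy peptide are hypothetical, and the pattern assumes that "x" in Sxxx(I,L)E means any single residue. A lookahead is used so that overlapping occurrences are also reported.

```python
import re

# S, any three residues, I or L, then E; the lookahead permits overlapping hits.
MOTIF = re.compile(r"(?=S...[IL]E)")


def motif_positions(protein):
    """Return 1-based start positions of every Sxxx(I,L)E occurrence."""
    return [match.start() + 1 for match in MOTIF.finditer(protein.upper())]


# Hypothetical toy peptide, not taken from SEQ ID NO: 2.
print(motif_positions("MASAAALEKKSTTTIEQ"))
```

Counting the returned positions for each entry in a protein database would reproduce the "three or more times" filter mentioned in the text.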

[0025] Accordingly, the present invention provides isolated nucleic acids encoding a novel gene, EHOC-1, which exists in human chromosome 21, specifically at the q22.3 locus, which is the site of mutation(s) that cause PME, HPE1, and APECED. The term “nucleic acids” (also referred to as polynucleotides) encompasses RNA as well as single and double-stranded DNA and cDNA. As used herein, the phrase “isolated” means a nucleic acid that is in a form that does not occur in nature. One means of isolating a nucleic acid encoding an EHOC-1 polypeptide is to probe a mammalian genomic library with a natural or artificially designed DNA probe using methods well known in the art. DNA probes derived from the EHOC-1 gene are particularly useful for this purpose. DNA and cDNA molecules that encode EHOC-1 polypeptides can be used to obtain complementary genomic DNA, cDNA or RNA from human, mammalian, or other animal sources, or to isolate related cDNA or genomic clones by the screening of cDNA or genomic libraries, by methods described in more detail below. Examples of nucleic acids are RNA, cDNA, or isolated genomic DNA encoding an EHOC-1 polypeptide. Such nucleic acids may have coding sequences substantially the same as the coding sequence shown in SEQ ID NO: 2. This invention also encompasses nucleic acids which differ from the nucleic acids shown in SEQ ID NO: 1, but which have the same phenotype, i.e., encode substantially the same amino acid sequence set forth in SEQ ID NO: 2.

[0026] Phenotypically similar nucleic acids are also referred to as “functionally equivalent nucleic acids”. As used herein, the phrase “functionally equivalent nucleic acids” encompasses nucleic acids characterized by slight and non-consequential sequence variations that will function in substantially the same manner to produce the same protein product(s) as the nucleic acids disclosed herein. In particular, functionally equivalent nucleic acids encode polypeptides that are the same as those disclosed herein or that have conservative amino acid variations. For example, conservative variations include substitution of a non-polar residue with another non-polar residue, or substitution of a charged residue with a similarly charged residue. These variations include those recognized by skilled artisans as those that do not substantially alter the tertiary structure of the protein.

[0027] Further provided are nucleic acids encoding EHOC-1 polypeptides that, by virtue of the degeneracy of the genetic code, do not necessarily hybridize to the invention nucleic acids under specified hybridization conditions. Preferred nucleic acids encoding the invention polypeptide are comprised of nucleotides that encode substantially the same amino acid sequence set forth in SEQ ID NO: 2. Alternatively, preferred nucleic acids encoding the invention polypeptide(s) hybridize under high stringency conditions to substantially the entire sequence, or substantial portions (i.e., typically at least 15-30 nucleotides) of the nucleic acid sequence set forth in SEQ ID NO: 1.
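The effect of codon degeneracy can be illustrated with a short sketch in which two different DNA sequences encode the same peptide. The function names and the nine-nucleotide toy sequences are hypothetical and are not drawn from SEQ ID NO: 1; the standard genetic code is assumed.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def nucleotide_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x != y)


# Two hypothetical coding sequences that differ only at synonymous third positions.
variant_a = "ATGGCTAAA"
variant_b = "ATGGCCAAG"
print(translate(variant_a), translate(variant_b))
print(nucleotide_differences(variant_a, variant_b))
```

Over a coding sequence of several kilobases, enough synonymous changes of this kind can accumulate that two nucleic acids encoding the same polypeptide no longer hybridize under stringent conditions.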

[0028] Stringency of hybridization, as used herein, refers to conditions under which polynucleotide hybrids are stable. As known to those of skill in the art, the stability of hybrids is a function of sodium ion concentration and temperature. (See, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual 2d Ed. (Cold Spring Harbor Laboratory, 1989); incorporated herein by reference.)
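The dependence of hybrid stability on sodium ion concentration can be sketched with a widely quoted empirical approximation for the melting temperature of long DNA duplexes. The formula and its constants are an assumption introduced here for illustration and are not recited in this specification; the function name and the test sequence are hypothetical, and the approximation is not suited to short oligonucleotides.

```python
import math


def melting_temperature(sequence, sodium_molar, formamide_percent=0.0):
    """Approximate Tm in degrees C for a long DNA duplex.

    Assumed empirical form (not from the specification):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.63*(%formamide) - 600/length
    """
    seq = sequence.upper()
    length = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length
    )


# Hypothetical 100-nucleotide duplex with 50% G+C content.
duplex = "GC" * 25 + "AT" * 25
for sodium in (1.0, 0.1, 0.02):
    print(sodium, round(melting_temperature(duplex, sodium), 1))
```

Under this approximation, lowering the sodium concentration lowers the melting temperature, so a wash at a fixed temperature becomes more stringent as the salt concentration is reduced.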

[0029] Also provided are isolated peptides, polypeptide(s) and/or protein(s) encoded by the invention nucleic acids which are EHOC-1 polypeptides. The EHOC-1 polypeptide comprises a protein of approximately 1190 amino acids in length. The predicted amino acid sequence of the EHOC-1 polypeptide is set forth in SEQ ID NO: 2.

[0030] As used herein, the term “isolated” means a protein molecule free of cellular components and/or contaminants normally associated with a native in vivo environment. Invention polypeptides and/or proteins include any naturally occurring allelic variant, as well as recombinant forms thereof. The EHOC-1 polypeptides can be isolated using various methods well known to a person of skill in the art. The methods available for the isolation and purification of invention proteins include precipitation, gel filtration, ion-exchange, reverse-phase and affinity chromatography. Other well-known methods are described in Deutscher et al., Guide to Protein Purification: Methods in Enzymology Vol. 182, (Academic Press, 1990), which is incorporated herein by reference. Alternatively, the isolated polypeptides of the present invention can be obtained using well-known recombinant methods as described, for example, in Sambrook et al., supra (1989).

[0031] An example of the means for preparing the invention polypeptide(s) is to express nucleic acids encoding the EHOC-1 in a suitable host cell, such as a bacterial cell, a yeast cell, an amphibian cell (i.e., oocyte), or a mammalian cell, using methods well known in the art, and recovering the expressed polypeptide, again using well-known methods. Invention polypeptides can be isolated directly from cells that have been transformed with expression vectors, described below in more detail. The invention polypeptide, biologically active fragments, and functional equivalents thereof can also be produced by chemical synthesis. As used herein, “biologically active fragment” refers to any portion of the polypeptide represented by the amino acid sequence in SEQ ID NO: 2 that can assemble into a cationic channel permeable to Ca²⁺ which is activated by acetylcholine. Synthetic polypeptides can be produced using an Applied Biosystems, Inc. Model 430A or 431A automatic peptide synthesizer (Foster City, Calif.) employing the chemistry provided by the manufacturer.

[0032] As used herein, the phrase “EHOC-1” refers to recombinantly expressed/produced (i.e., isolated or substantially pure) proteins that contain highly hydrophobic regions which predict potential membrane spanning regions, having homologies to multiple transmembrane proteins, including sodium channel, calcium channel and potassium channel proteins, including variants thereof encoded by mRNA generated by alternative splicing of a primary transcript, and further including fragments thereof which retain one or more of the aforementioned properties. As used herein, the phrase “functional polypeptide” means that binding of ligands, for example, causes transcriptional activation of EHOC-1 proteins. More specifically, agonist activation of a “functional invention polypeptide” induces the protein to generate a signal.

[0033] Modification of the invention nucleic acids, polypeptides or proteins with the following phrases: “recombinantly expressed/produced”, “isolated”, or “substantially pure”, encompasses nucleic acids, peptides, polypeptides or proteins that have been produced in such form by the hand of man, and are thus separated from their native in vivo cellular environment. As a result of this human intervention, the recombinant nucleic acids, polypeptides and proteins of the invention are useful in ways that the corresponding naturally occurring molecules are not, such as identification of selective drugs or compounds.

[0034] Sequences having “substantial sequence homology” are intended to refer to nucleotide sequences that share at least about 90% identity with invention nucleic acids, and amino acid sequences that typically share at least about 95% amino acid identity with invention polypeptides. It is recognized, however, that polypeptides or nucleic acids containing less than the above-described levels of homology arising as splice variants or that are modified by conservative amino acid substitutions, or by substitution of degenerate codons, are also encompassed within the scope of the present invention.
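A percent identity figure of the kind recited above can be computed from a pairwise alignment, as in the following minimal sketch. The function name, the gap character and the convention of counting only ungapped alignment columns are assumptions made for illustration; the specification does not prescribe a particular alignment method or denominator, and the sequences shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical aligned nucleotide sequences differing at one of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, identity >= 90.0)
```

The same calculation accounts for figures such as those given for FIG. 3A; for example, 10 identical residues in a 34 amino acid overlap correspond to approximately 29.4% identity.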

[0035] The present invention provides the isolated polynucleotide operatively linked to a promoter of RNA transcription, as well as other regulatory sequences. As used herein, the phrase “operatively linked” refers to the functional relationship of the polynucleotide with regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences. For example, operative linkage of a polynucleotide to a promoter refers to the physical and functional relationship between the polynucleotide and the promoter such that transcription of DNA is initiated from the promoter by an RNA polymerase that specifically recognizes and binds to the promoter, and wherein the promoter directs the transcription of RNA from the polynucleotide.

[0036] Promoter regions include specific sequences that are sufficient for RNA polymerase recognition, binding and transcription initiation. Additionally, promoter regions include sequences that modulate the recognition, binding and transcription initiation activity of RNA polymerase. Such sequences may be cis acting or may be responsive to trans acting factors. Depending upon the nature of the regulation, promoters may be constitutive or regulated. Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, and the like.

[0037] Vectors that contain both a promoter and a cloning site into which a polynucleotide can be operatively linked are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo, and are commercially available from sources such as Stratagene (La Jolla, Calif.) and Promega Biotech (Madison, Wis.). In order to optimize expression and/or in vitro transcription, it may be necessary to remove, add or alter 5′ and/or 3′ untranslated portions of the clones to eliminate extra, potentially inappropriate alternative translation initiation codons or other sequences that may interfere with or reduce expression, either at the level of transcription or translation. Alternatively, consensus ribosome binding sites can be inserted immediately 5′ of the start codon to enhance expression. (See, for example, Kozak, J. Biol. Chem. 266:-9867 (1991)). Similarly, alternative codons, encoding the same amino acid, can be substituted for coding sequences of the EHOC-1 polypeptide in order to enhance transcription (e.g., the codon preference of the host cell can be adopted, the presence of G-C rich domains can be reduced, and the like).

[0038] Also provided are vectors comprising the invention nucleic acids. Examples of vectors are viruses, such as baculoviruses and retroviruses, bacteriophages, cosmids, plasmids and other recombination vehicles typically used in the art. Polynucleotides are inserted into vector genomes using methods well known in the art. For example, insert and vector DNA can be contacted, under suitable conditions, with a restriction enzyme to create complementary ends on each molecule that can pair with each other and be joined together with a ligase. Alternatively, synthetic nucleic acid linkers can be ligated to the termini of the restricted polynucleotide. These synthetic linkers contain nucleic acid sequences that correspond to a particular restriction site in the vector DNA. Additionally, an oligonucleotide containing a termination codon and an appropriate restriction site can be ligated for insertion into a vector containing, for example, some or all of the following: a selectable marker gene, such as the neomycin gene for selection of stable or transient transfectants in mammalian cells; enhancer/promoter sequences from the immediate early gene of human CMV for high levels of transcription; transcription termination and RNA processing signals from SV40 for mRNA stability; SV40 polyoma origins of replication and ColE1 for proper episomal replication; versatile multiple cloning sites; and T7 and SP6 RNA promoters for in vitro transcription of sense and antisense RNA. Other means are well known and available in the art.

[0039] Also provided are vectors comprising a nucleic acid encoding an EHOC-1 polypeptide, adapted for expression in a bacterial cell, a yeast cell, an amphibian cell (i.e., oocyte), a mammalian cell and other animal cells. The vectors additionally comprise the regulatory elements necessary for expression of the nucleic acid in the bacterial, yeast, amphibian, mammalian or animal cells so located relative to the nucleic acid encoding EHOC-1 polypeptide as to permit expression thereof. As used herein, “expression” refers to the process by which nucleic acids are transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression may include splicing of the mRNA, if an appropriate eucaryotic host is selected. Regulatory elements required for expression include promoter sequences to bind RNA polymerase and translation initiation sequences for ribosome binding. For example, a bacterial expression vector includes a promoter such as the lac promoter and, for translation initiation, the Shine-Dalgarno sequence and the start codon AUG (Sambrook et al., supra). Similarly, a eucaryotic expression vector includes a heterologous or homologous promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors can be obtained commercially or assembled from the sequences described by methods well known in the art, for example, the methods described above for constructing vectors in general. Expression vectors are useful to produce cells that express the invention polypeptide.

[0040] This invention provides a transformed host cell that recombinantly expresses the EHOC-1 polypeptide. The host cell has been transformed with a nucleic acid encoding an EHOC-1 polypeptide. An example is a mammalian cell comprising a plasmid adapted for expression in a mammalian cell. The plasmid contains a nucleic acid encoding an EHOC-1 polypeptide and the regulatory elements necessary for expression of the invention protein. Various mammalian cells may be utilized as hosts, including, for example, mouse fibroblast cell NIH3T3, CHO cells, HeLa cells, Ltk- cells, etc. Expression plasmids such as those described supra can be used to transfect mammalian cells by methods well known in the art such as calcium phosphate precipitation, DEAE-dextran, electroporation, microinjection or lipofection.

[0041] EHOC-1 polypeptides expressed recombinantly on eucaryotic cell surfaces may contain at least one EHOC-1 polypeptide, or may contain a mixture of peptides encoded by the host cell and/or subunits encoded by heterologous nucleic acids.

[0042] The present invention provides nucleic acid probes comprising nucleotide sequences capable of specifically hybridizing with sequences included within the nucleic acid sequence encoding an EHOC-1 polypeptide, for example, a coding sequence included within the nucleotide sequence shown in SEQ ID NO: 1. As used herein, a “probe” is a single-stranded DNA or RNA that has a sequence of nucleotides that includes at least 15 contiguous bases set forth in SEQ ID NO: 1. Preferred regions from which to construct probes include 5′ and/or 3′ coding sequences, sequences within the ORF, sequences predicted to encode transmembrane domains, sequences predicted to encode cytoplasmic loops, signal sequences, ligand binding sites, and the like. Full-length cDNA clones or fragments thereof can also be used as probes for the detection and isolation of related genes. When fragments are used as probes, preferably the cDNA sequences will be from the carboxyl end-encoding portion of the cDNA, and most preferably will include predicted transmembrane domain-encoding portions of the cDNA sequence. Transmembrane domain regions can be predicted based on hydropathy analysis of the deduced amino acid sequence using, for example, the method of Kyte and Doolittle, J. Mol. Biol. 157:105 (1982).
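The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy scale. The following Python example is a minimal sketch only. The function names, the window of 19 residues and the cutoff of 1.6 are assumptions chosen here because they are commonly used for locating candidate membrane-spanning segments; they are not values stated in this specification.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence.

    Returns an empty list when the sequence is shorter than the window.
    """
    values = [KD_SCALE[residue] for residue in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_windows(protein: str, window: int = 19,
                         cutoff: float = 1.6) -> list:
    """1-based start positions of windows whose mean exceeds the cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > cutoff]


if __name__ == "__main__":
    # Hypothetical test peptide: a hydrophobic stretch flanked by charged residues.
    peptide = "KKDE" + "LIVLAVLIVLAVLIVLAVL" + "EEKR"
    print(candidate_tm_windows(peptide))
```

Windows flagged in this way are only candidates; a deduced amino acid sequence such as SEQ ID NO: 2 would be passed as the `protein` argument.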

[0043] As used herein, the phrase “specifically hybridizing” encompasses the ability of a polynucleotide to recognize a sequence of nucleic acids that are complementary thereto and to form double-helical segments via hydrogen bonding between complementary base pairs. Nucleic acid probe technology is well known to those skilled in the art, who will readily appreciate that such probes may vary greatly in length and may be labeled with a detectable agent, such as a radioisotope, a fluorescent dye, and the like, to facilitate detection of the probe. Invention probes are useful to detect the presence of nucleic acids encoding the EHOC-1 polypeptide. For example, the probes can be used for in situ hybridizations in order to locate biological tissues in which the invention gene is expressed. Additionally, synthesized oligonucleotides complementary to the nucleic acids of a nucleotide sequence encoding EHOC-1 polypeptide are useful as probes for detecting the invention genes, their associated mRNA, or for the isolation of related genes using homology screening of genomic or cDNA libraries, or by using amplification techniques well known to one of skill in the art.
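Probes as defined herein include at least 15 contiguous bases set forth in SEQ ID NO: 1. A simple way to check a candidate probe against that criterion is sketched below in Python. The function name `has_contiguous_match` is hypothetical, and the reference string used in the example is only the first 40 bases of SEQ ID NO: 1 rather than the full sequence. The check considers the listed strand only; a probe complementary to the listed strand would first need to be reverse-complemented.

```python
def has_contiguous_match(probe: str, reference: str, min_len: int = 15) -> bool:
    """True if any run of min_len contiguous probe bases occurs in reference."""
    probe = probe.upper()
    reference = reference.upper()
    if len(probe) < min_len:
        return False
    return any(
        probe[i:i + min_len] in reference
        for i in range(len(probe) - min_len + 1)
    )


if __name__ == "__main__":
    # First 40 bases of SEQ ID NO: 1 (fragment used for illustration only).
    fragment = "CTGCAGGAATCGGCACGAGGCGGCGCAACCGGCTCCGGAG"
    print(has_contiguous_match(fragment[5:25], fragment))   # 20-base probe
    print(has_contiguous_match(fragment[5:19], fragment))   # 14 bases, too short
```

This is an exact-match test on sequence text and says nothing about hybridization behavior, which depends on stringency conditions.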

[0044] Also provided are antisense oligonucleotides having a sequence capable of binding specifically with any portion of an mRNA that encodes the EHOC-1 polypeptide so as to prevent translation of the mRNA. The antisense oligonucleotide may have a sequence capable of binding specifically with any portion of the sequence of the cDNA encoding the EHOC-1 polypeptide. As used herein, the phrase “binding specifically” encompasses the ability of a nucleic acid sequence to recognize a complementary nucleic acid sequence and to form double-helical segments therewith via the formation of hydrogen bonds between the complementary base pairs. An example of an antisense oligonucleotide is an antisense oligonucleotide comprising chemical analogs of nucleotides.
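Because an antisense oligonucleotide binds its target through complementary base pairing, its sequence is the reverse complement of the targeted stretch of mRNA. The Python sketch below derives a DNA antisense sequence for a chosen target region. The function name `antisense_oligo` is an assumption, and the target used in the example is the first 18 bases of the coding region shown in SEQ ID NO: 1, chosen purely for illustration and not as a recommended target site.

```python
# Complement map for DNA or RNA input; the output is written as DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_oligo(target: str) -> str:
    """Return the DNA reverse complement of a sense-strand target (DNA or RNA)."""
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # First six codons of the EHOC-1 coding region (sense strand).
    sense = "ATGGACGCCTCTGAGGAG"
    print(antisense_oligo(sense))
```

The same function accepts the target written as RNA (with U in place of T) and returns the identical DNA oligonucleotide.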

[0045] Also provided herein are compositions comprising (a) an amount of the antisense oligonucleotide described above effective to reduce expression of the EHOC-1 polypeptide by passing through a cell membrane and binding specifically with mRNA encoding an EHOC-1 polypeptide so as to prevent its translation, and (b) an acceptable hydrophobic carrier capable of passing through a cell membrane. The acceptable hydrophobic carrier capable of passing through cell membranes may also comprise a structure which binds to a receptor specific for a selected cell type and is thereby taken up by cells of the selected cell type. The structure may be part of a protein known to bind to a cell-type specific receptor.

[0046] Antisense oligonucleotide compositions inhibit translation of mRNA encoding the invention polypeptides. Synthetic oligonucleotides, or other antisense chemical structures, are designed to bind to mRNA encoding the EHOC-1 polypeptides and inhibit translation of the mRNA, and are useful as compositions to inhibit expression of EHOC-1 associated genes in a tissue sample or in a subject.

[0047] This invention provides a means to modulate levels of expression of EHOC-1 polypeptides by the use of a synthetic antisense oligonucleotide composition (hereinafter SAOC) which inhibits translation of mRNA encoding these polypeptides. Synthetic oligonucleotides, or other antisense chemical structures designed to recognize and selectively bind to mRNA, are constructed of DNA, RNA or chemically modified, artificial nucleic acids to be complementary to portions of the nucleotide sequence shown in SEQ ID NO: 1. The SAOC is designed to be stable in the blood stream for administration to a subject by injection, or in laboratory cell culture conditions. The SAOC is designed to be capable of passing through the cell membrane in order to enter the cytoplasm of the cell by virtue of physical and chemical properties of the SAOC which render it capable of passing through cell membranes, for example, by designing small, hydrophobic SAOC chemical structures, or by virtue of specific transport systems in the cell which recognize and transport the SAOC into the cell. In addition, the SAOC can be designed for administration only to certain selected cell populations by targeting the SAOC to be recognized by specific cellular uptake mechanisms which bind and take up the SAOC only within select cell populations. For example, the SAOC may be designed to bind to a receptor found only in a certain cell type, as discussed supra. The SAOC is also designed to recognize and selectively bind to the target mRNA sequence, which may correspond to a sequence contained within the sequence shown in SEQ ID NO: 1. The SAOC is designed to inactivate the target mRNA sequence by either binding to the target mRNA and inducing degradation of the mRNA by, for example, RNase I digestion, or inhibiting translation of the mRNA target by interfering with the binding of translation-regulating factors or ribosomes, or inclusion of other chemical structures, such as ribozyme sequences or reactive chemical groups which either degrade or chemically modify the target mRNA. SAOCs have been shown to be capable of such properties when directed against mRNA targets (see Cohen et al., TIPS, 10:435 (1989) and Weintraub, Sci. American, January (1990), p. 40; both incorporated herein by reference).

[0048] This invention provides a composition containing an acceptable carrier and any of an isolated, purified EHOC-1 polypeptide, an active fragment thereof, or a purified, mature protein and active fragments thereof, alone or in combination with each other. These polypeptides or proteins can be recombinantly derived, chemically synthesized or purified from native sources. As used herein, the term “acceptable carrier” encompasses any of the standard pharmaceutical carriers, such as phosphate buffered saline solution, water and emulsions such as an oil/water or water/oil emulsion, and various types of wetting agents.

[0049] Also provided are antibodies having specific reactivity with the EHOC-1 polypeptides of the subject invention. Active fragments of antibodies are encompassed within the definition of “antibody”.

[0050] The antibodies of the invention can be produced by methods known in the art using the invention polypeptides, proteins or portions thereof as antigens. For example, polyclonal and monoclonal antibodies can be produced by methods well known in the art, as described, for example, in Harlow and Lane, Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory 1988), which is incorporated herein by reference. The polypeptide of the present invention can be used as the immunogen in generating such antibodies. Alternatively, synthetic peptides can be prepared (using commercially available synthesizers) and used as immunogens. Amino acid sequences can be analyzed by methods well known in the art to determine whether they encode hydrophobic or hydrophilic domains of the corresponding polypeptide. Altered antibodies such as chimeric, humanized, CDR-grafted or bifunctional antibodies can also be produced by methods well known in the art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods described, for example, in Sambrook et al., supra, and Harlow and Lane, supra. Both anti-peptide and anti-fusion protein antibodies can be used (see, for example, Bahouth et al., Trends Pharmacol. Sci. 12:338 (1991); Ausubel et al., Current Protocols in Molecular Biology (John Wiley and Sons, N.Y. 1989), which are incorporated herein by reference).

[0051] The invention antibodies can be used to isolate the invention polypeptides. Additionally, the antibodies are useful for detecting the presence of the invention polypeptides, as well as for analysis of chromosome localization and the structure of functional domains. Methods for detecting the presence of an EHOC-1 polypeptide on the surface of a cell comprise contacting the cell with an antibody that specifically binds to the EHOC-1 polypeptide, under conditions permitting binding of the antibody to the polypeptides, detecting the presence of the antibody bound to the cell, and thereby detecting the presence of the invention polypeptide on the surface of the cell. With respect to the detection of such polypeptides, the antibodies can be used for in vitro diagnostic or in vivo imaging methods.

[0052] Immunological procedures useful for in vitro detection of the target EHOC-1 polypeptide in a sample include immunoassays that employ a detectable antibody. Such immunoassays include, for example, ELISA, Pandex microfluorimetric assay, agglutination assays, flow cytometry, serum diagnostic assays and immunohistochemical staining procedures which are well known in the art. An antibody can be made detectable by various means well known in the art. For example, a detectable marker can be directly or indirectly attached to the antibody. Useful markers include, for example, radionuclides, enzymes, fluorogens, chromogens and chemiluminescent labels.

[0053] Further, invention antibodies can be used to modulate the activity of the EHOC-1 polypeptide in living animals, in humans, or in biological tissues or fluids isolated therefrom. Accordingly, compositions are provided comprising a carrier and an amount of an antibody having specificity for the EHOC-1 polypeptide effective to block binding of naturally occurring ligands to the EHOC-1 polypeptides. A monoclonal antibody directed to an epitope of an EHOC-1 polypeptide molecule present on the surface of a cell and having an amino acid sequence substantially the same as an amino acid sequence for a cell surface epitope of an EHOC-1 polypeptide shown in SEQ ID NO: 2 can be useful for this purpose.

[0054] The invention provides a transgenic non-human mammal that is capable of expressing nucleic acids encoding an EHOC-1 polypeptide. Also provided is a transgenic non-human mammal capable of expressing nucleic acids encoding an EHOC-1 polypeptide so mutated as to be incapable of normal activity, i.e., does not express native EHOC-1. The present invention also provides a transgenic non-human mammal having a genome comprising antisense nucleic acids complementary to nucleic acids encoding an EHOC-1 polypeptide so placed as to be transcribed into antisense mRNA complementary to mRNA encoding an EHOC-1 polypeptide, which hybridizes thereto and, thereby, reduces the translation thereof. The nucleic acid may additionally comprise an inducible promoter and/or tissue specific regulatory elements, so that expression can be induced, or restricted to specific cell types. Examples of nucleic acids are DNA or cDNA having a coding sequence substantially the same as the coding sequence shown in SEQ ID NO: 1. An example of a non-human transgenic mammal is a transgenic mouse. Examples of tissue specificity-determining elements are the metallothionein promoter and the L7 promoter.

[0055] Animal model systems which elucidate the physiological and behavioral roles of EHOC-1 polypeptides are produced by creating transgenic animals in which the expression of the EHOC-1 polypeptide is altered using a variety of techniques. Examples of such techniques include the insertion of normal or mutant versions of nucleic acids encoding an EHOC-1 polypeptide by microinjection, retroviral infection or other means well known to those skilled in the art, into appropriate fertilized embryos to produce a transgenic animal (see, for example, Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual (Cold Spring Harbor Laboratory, 1986)). Another technique, homologous recombination of mutant or normal versions of these genes with the native gene locus in transgenic animals, may be used to alter the regulation of expression or the structure of the EHOC-1 polypeptide (see Capecchi et al., Science 244:1288 (1989); Zimmer et al., Nature 338:150 (1989); which are incorporated herein by reference). Homologous recombination techniques are well known in the art. Homologous recombination replaces the native (endogenous) gene with a recombinant or mutated gene to produce an animal that cannot express native (endogenous) protein but can express, for example, a mutated protein which results in altered expression of the EHOC-1 polypeptide. In contrast to homologous recombination, microinjection adds genes to the host genome, without removing host genes. Microinjection can produce a transgenic animal that is capable of expressing both endogenous and exogenous EHOC-1 protein. Inducible promoters can be linked to the coding region of the nucleic acids to provide a means to regulate expression of the transgene. Tissue specific regulatory elements can be linked to the coding region to permit tissue-specific expression of the transgene. Transgenic animal model systems are useful for in vivo screening of compounds for identification of specific ligands, i.e., agonists and antagonists, which activate or inhibit protein responses.

[0056] The nucleic acids, oligonucleotides (including antisense), vectors containing same, transformed host cells, polypeptides and combinations thereof, as well as antibodies of the present invention, can be used to screen compounds in vitro to determine whether a compound functions as a potential agonist or antagonist to the invention polypeptide. These in vitro screening assays provide information regarding the function and activity of the invention polypeptide, which can lead to the identification and design of compounds that are capable of specific interaction with one or more types of polypeptides, peptides or proteins.

[0057] In accordance with still another embodiment of the present invention, there is provided a method for identifying compounds which bind to EHOC-1 polypeptides. The invention proteins may be employed in a competitive binding assay. Such an assay can accommodate the rapid screening of a large number of compounds to determine which compounds, if any, are capable of binding to EHOC-1 proteins. Subsequently, more detailed assays can be carried out with those compounds found to bind, to further determine whether such compounds act as modulators, agonists or antagonists of invention proteins.

[0058] In another embodiment of the invention, there is provided a bioassay for identifying compounds which modulate the activity of invention polypeptides. According to this method, invention polypeptides are contacted with an “unknown” or test substance (in the presence of a reporter gene construct when antagonist activity is tested), the activity of the polypeptide is monitored subsequent to the contact with the “unknown” or test substance, and those substances which cause the reporter gene construct to be expressed are identified as functional ligands for EHOC-1 polypeptides.

[0059] In accordance with another embodiment of the present invention, transformed host cells that recombinantly express invention polypeptides can be contacted with a test compound, and the modulating effect(s) thereof can then be evaluated by comparing the EHOC-1-mediated response (via reporter gene expression) in the presence and absence of test compound, or by comparing the response of test cells and control cells (i.e., cells that do not express EHOC-1 polypeptides) to the presence of the compound.

[0060] As used herein, a compound or a signal that “modulates the activity” of an invention polypeptide refers to a compound or a signal that alters the activity of EHOC-1 polypeptides so that the activity of the invention polypeptide is different in the presence of the compound or signal than in the absence of the compound or signal. In particular, such compounds or signals include agonists and antagonists. An agonist encompasses a compound or a signal that activates EHOC-1 protein expression. Alternatively, an antagonist includes a compound or signal that interferes with EHOC-1 protein expression. Typically, the effect of an antagonist is observed as a blocking of agonist-induced protein activation. Antagonists include competitive and non-competitive antagonists. A competitive antagonist (or competitive blocker) interacts with or near the site specific for agonist binding. A non-competitive antagonist or blocker inactivates the function of the polypeptide by interacting with a site other than the agonist interaction site.

[0061] As understood by those of skill in the art, assay methods for identifying compounds that modulate EHOC-1 activity generally require comparison to a control. One type of “control” is a cell or culture that is treated substantially the same as the test cell or test culture exposed to the compound, with the distinction that the “control” cell or culture is not exposed to the compound. For example, in methods that use voltage clamp electrophysiological procedures, the same cell can be tested in the presence or absence of compound, by merely changing the external solution bathing the cell. Another type of “control” cell or culture may be a cell or culture that is identical to the transfected cells, with the exception that the “control” cell or culture does not express native proteins. Accordingly, the response of the transfected cell to compound is compared to the response (or lack thereof) of the “control” cell or culture to the same compound under the same reaction conditions.
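The two kinds of control comparison described above can be sketched numerically. The Python example below is an illustration only: the function names, the two-fold threshold and all readout values are hypothetical and are not taken from this specification. It compares the response of transfected cells with and without compound, and then checks that cells lacking EHOC-1 do not show the same change.

```python
def fold_change(with_compound: float, without_compound: float) -> float:
    """Ratio of a response measured with compound to the same response without it."""
    if without_compound <= 0:
        raise ValueError("baseline response must be positive")
    return with_compound / without_compound


def classify_effect(test_fold: float, control_fold: float,
                    threshold: float = 2.0) -> str:
    """Label a compound from fold changes in test and control cells.

    threshold is a hypothetical cutoff; a real assay would set it from
    replicate variability.
    """
    control_changed = (control_fold >= threshold
                       or control_fold <= 1.0 / threshold)
    if control_changed:
        return "nonspecific"  # control cells respond too
    if test_fold >= threshold:
        return "increase"
    if test_fold <= 1.0 / threshold:
        return "decrease"
    return "no effect"


if __name__ == "__main__":
    # Hypothetical reporter readouts (arbitrary units).
    test = fold_change(with_compound=840.0, without_compound=210.0)
    control = fold_change(with_compound=205.0, without_compound=200.0)
    print(classify_effect(test, control))
```

A change seen only in the transfected cells is the pattern that would be followed up with more detailed assays.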

[0062] In yet another embodiment of the present invention, the activation of EHOC-1 polypeptides can be modulated by contacting the polypeptides with an effective amount of at least one compound identified by the above-described bioassay.

[0063] The invention will now be described in greater detail with reference to the following non-limiting examples.

EXAMPLE 1 Construction of BAC Contig

[0064] Construction of a BAC library from total human genomic DNA has been described elsewhere (Shizuya et al., Proc. Natl. Acad. Sci. USA 89:8794-8797 (1992)). BAC clones were screened by PCR using STSs (PFKL, D21S25, D21S154, CD18). The loci of these BAC clones were confirmed by fluorescence in situ hybridization. The insert sizes of these clones were measured by pulsed-field gel electrophoresis after digesting the DNA with NotI.

EXAMPLE 2 Direct cDNA Selection

[0065] Direct selection procedures were similar to those of Morgan et al. (Nucleic Acids Res. 20:5173-5179 (1992)) with some modifications. Total RNA was isolated from 14 week trisomy 21 fetal brain using TRI Reagent (Molecular Research Center, Inc.). Poly(A)⁺ RNA was isolated using the Poly(A) Quick mRNA isolation kit (STRATAGENE). Double stranded cDNA was synthesized using the SuperScript™ Choice System (GIBCO BRL) from 5 μg trisomy 21 fetal brain poly(A)⁺ RNA using 1 μg oligo(dT)₁₅ or 0.1 μg random hexamer. The entire synthesis reaction was purified with the Gene Clean® II kit (BIO101, Inc.) and was kinased. Sau3AI linker was attached to the cDNA, which was then digested with Sau3AI. The reaction was purified using Gene Clean. MboI linker I (Morgan et al., Nucleic Acids Res. 20:5173-5179 (1992)) was attached to the cDNA and purified by Gene Clean. The product was amplified by PCR using one strand of MboI linker (5′CCTGATGCTCGAG,-AATTC3′) as a primer. Cycling conditions were 40 cycles of 94° C./15 seconds, 60° C./23 seconds, 72° C./2 minutes in 100 μl of 1×PCR buffer (Promega), 3 mM MgCl₂, 5.0 units of Taq polymerase (Promega), 2 μM primer and 0.2 mM dNTPs. Five kinds of BAC DNA (total 2.μg) were prepared using the QIAGEN plasmid kit and were biotinylated using the Nick Translation Kit and biotin-16-dUTP (Boehringer Mannheim). 3 μg of heat denatured PCR amplified cDNA was annealed with 3 μg of heat denatured COT1 DNA (BRL) in 100 μl hybridization buffer (750 mM NaCl, 50 mM NaPO₄ (pH 7.2), 5 mM EDTA, 5×Denhardt's, 0.05% SDS and 50% formamide) at 42° C. for two hours. After prehybridization, 1.2 μg of heat denatured biotinylated BAC DNA was added and incubated at 42° C. for 16 hours. The cDNA-BAC DNA hybrid was precipitated with EtOH and dissolved in 60 μl of 10 mM Tris-HCl (pH 8.0), 1 mM EDTA. After addition of 40 μl of 5 M NaCl, the DNA was captured on magnetic beads (Dynabeads M-280, Dynal) at 25° C. for 1 hour with gentle rotation. The beads were washed twice by pipetting in 400 μl of 2×SSC, setting in a magnet holder (MPC-E™, Dynal) for 30 seconds and removing the supernatant. Four additional washes were done in 0.2×SSC at 68° C. for 10 minutes each, transferring the beads to new tubes each time. cDNAs were eluted in 100 μl of distilled water for 10 minutes at 80° C. with occasional mixing. The eluted cDNAs were amplified by PCR as described above. After the selection procedure on magnetic beads was repeated twice, amplified cDNAs were digested with EcoRI and subcloned into pBluescript II.
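Two quantities implied by the protocol above can be worked out with simple arithmetic: the NaCl concentration during bead capture (40 μl of 5 M NaCl added to 60 μl of dissolved hybrid) and the time spent at the programmed temperatures over 40 PCR cycles. The Python sketch below shows the calculation. The function names are illustrative assumptions, the dilution treats the volumes as additive, and the cycling time ignores ramp times and any initial or final holds, which the text does not specify.

```python
def final_concentration(stock_molar: float, stock_ul: float,
                        sample_ul: float) -> float:
    """Concentration after mixing a stock into a sample (additive volumes)."""
    return stock_molar * stock_ul / (stock_ul + sample_ul)


def total_cycling_seconds(cycles: int, step_seconds: tuple) -> int:
    """Time at programmed temperatures only, excluding ramps and holds."""
    return cycles * sum(step_seconds)


if __name__ == "__main__":
    # 40 ul of 5 M NaCl added to 60 ul of cDNA-BAC hybrid in Tris-EDTA.
    print(final_concentration(5.0, 40.0, 60.0))      # molar NaCl at capture
    # 40 cycles of 15 s, 23 s and 2 min as stated in the example.
    print(total_cycling_seconds(40, (15, 23, 120)))  # seconds
```

Under these assumptions the capture step takes place in 2 M NaCl, and the stated cycling program accounts for 6320 seconds, or a little over 105 minutes.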

EXAMPLE 3 Southern Blot Analysis

[0066] Gel electrophoresis of DNA was carried out on 0.8% agarose gels in 1×TBE. Transfer of nucleic acids to Hybond-N+ nylon membrane (Amersham) was performed following the manufacturer's instructions. Probes were labelled with the RadPrime Labeling System (BRL). Hybridization was carried out at 42° C. for 16 hours in 50% formamide, 5×SSPE, 5×Denhardt's, 0.1% SDS, 100 μg/ml denatured salmon sperm DNA. The filters were washed once in 1×SSC, 0.1% SDS at room temperature for 20 minutes, and twice in 0.1×SSC, 0.1% SDS for 20 minutes at 65° C. Blots were exposed to X-ray film (Kodak, X-OMAT-AR).

EXAMPLE 4 cDNA Library Screening

[0067] A trisomy 21 fetal brain cDNA library was constructed using the ZAP-cDNA® synthesis kit (STRATAGENE), which generates a unidirectional cDNA library. Briefly, double-stranded cDNA was synthesized from 5 μg trisomy 21 fetal brain poly(A)⁺ RNA using a hybrid oligo(dT)-XhoI linker primer with 5-methyl dCTP, attached to EcoRI linker, digested with EcoRI and XhoI, and cloned into UNI-ZAP XR vector. The library was packaged using Gigapack® II Gold packaging extract. The titer of the original library was 1.1×10⁶ p.f.u./package. The library was amplified once. Blue-white color assay indicated that 99% of clones have inserts. The average size of the inserts was 1.9 kb, calculated from 14 clones.

[0068] The screening of the trisomy 21 fetal brain cDNA library was performed using selected cDNA fragments. Phages were plated to an average density of 1×10⁵ per 75 cm² plate. Plaque lifts of 20 plates (2×10⁶ phages) were made using duplicated nylon membranes (Hybond-N+; Amersham). Hybridized membranes were washed to final stringency of 0.2×SSC, 1.0×SDS at 65° C. The filters were exposed to X-ray film overnight. Phages were subcloned into the plasmid vector pBluescript II SK(−) by M13-mediated excision for further analysis.
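The plating figures above can be checked with a short calculation. The Python sketch below multiplies the plating density by the number of plates and relates the result to the stated primary titer of 1.1×10⁶ p.f.u. The function names are assumptions made for illustration. Because the library was amplified once before screening, the ratio to the primary titer is only a rough indicator and not a measure of how completely the primary clones were sampled.

```python
def plaques_screened(density_per_plate: int, plates: int) -> int:
    """Total plaques lifted across all plates."""
    return density_per_plate * plates


def ratio_to_primary_titer(screened: int, primary_titer: float) -> float:
    """Plaques screened per primary recombinant (rough indicator only)."""
    return screened / primary_titer


if __name__ == "__main__":
    screened = plaques_screened(100_000, 20)   # 1x10^5 per plate, 20 plates
    print(screened)
    print(round(ratio_to_primary_titer(screened, 1.1e6), 2))
```

The product reproduces the 2×10⁶ phages stated in the example, which is a little under twice the primary titer.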

EXAMPLE 5 Northern Blot Analysis

[0069] cDNA inserts were cut out from the vector by digestion with XhoI and EcoRI. After labeling using the random priming method, the fragments were used as probes for Northern hybridization using Multiple Tissue Northern Blot (Clontech).

EXAMPLE 6 Metaphase Preparation

[0070] Chromosomes were prepared by using a BrdU block (Zabel et al., Proc. Natl. Acad. Sci. USA 80:6932-6936 (1983)) with some modification. Briefly, human peripheral lymphocytes were grown for 72 hours at 37° C. in RPMI 1640 (GIBCO BRL, Gaithersburg, Md.) supplemented with L-glutamine (2 mM), 15% fetal calf serum, penicillin (100 IU/ml), streptomycin (0.05 mg/ml) and 0.02% phytohemagglutinin. The cells were blocked in S-phase by adding 5-bromo-deoxyuridine (0.8 mg/ml) for 16 hours. They were then washed once with HBSS (Hanks Balanced Salt Solution) (GIBCO BRL, Gaithersburg, Md.) to remove the synchronizing agent and were released by incubating for five to six more hours in medium supplemented with 2.μg/ml of thymidine.

[0071] Cultures were harvested by the addition of 0.1 μg/ml of colcemid for 10 minutes followed by 0.075 M KCl hypotonic solution for 15 minutes at 37° C. prior to fixation with a 3:1 mixture of methanol and acetic acid for 1-5 minutes.

0 SEQUENCE LISTING (1) GENERAL INFORMATION: (iii) NUMBER OF SEQUENCES: 3(2) INFORMATION FOR SEQ ID NO: 1: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 5141 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double(D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vii) IMMEDIATE SOURCE:(A) LIBRARY: TRISOMY 21 FETAL BRAIN cDNA LIBRARY (B) CLONE: EHOC-1(viii) POSITION IN GENOME: (A) CHROMOSOME/SEGMENT: 21q22.3 (ix) FEATURE:(A) NAME/KEY: CDS (B) LOCATION: 157..3729 (xi) SEQUENCE DESCRIPTION: SEQID NO: 1: CTGCAGGAAT CGGCACGAGG CGGCGCAACC GGCTCCGGAG CTGCCTGGCGCGGCCGGGCG 60 GGCGGCGCCG CTCAGGCTCG GGCTCCGGCT GGGCCCGGCG CGGCCTCGGGGCTGCCCATG 120 GGGCGCGGGG GGCCGGGCCG GTGACGCCGG ACGCCC ATG GAC GCC TCTGAG GAG 174 Met Asp Ala Ser Glu Glu 1 5 CCG CTG CCG CCG GTG ATC TAC ACCATG GAG AAC AAG CCC ATC GTC ACC 222 Pro Leu Pro Pro Val Ile Tyr Thr MetGlu Asn Lys Pro Ile Val Thr 10 15 20 TGT GCT GGA GAT CAG AAT TTA TTT ACCTCT GTT TAT CCA ACG CTC TCT 270 Cys Ala Gly Asp Gln Asn Leu Phe Thr SerVal Tyr Pro Thr Leu Ser 25 30 35 CAG CAG CTT CCA AGA GAA CCA ATG GAA TGGAGA AGG TCC TAT GGC CGG 318 Gln Gln Leu Pro Arg Glu Pro Met Glu Trp ArgArg Ser Tyr Gly Arg 40 45 50 GCT CCG AAG ATG ATT CAC CTA GAG TCT AAC TTTGTT CAA TTC AAA GAG 366 Ala Pro Lys Met Ile His Leu Glu Ser Asn Phe ValGln Phe Lys Glu 55 60 65 70 GAG CTG CTG CCC AAA GAA GGA AAC AAA GCT CTGCTC ACG TTT CCC TTC 414 Glu Leu Leu Pro Lys Glu Gly Asn Lys Ala Leu LeuThr Phe Pro Phe 75 80 85 CTC CAT ATT TAC TGG ACA GAG TGC TGT GAT ACC GAAGTG TAT AAA GCT 462 Leu His Ile Tyr Trp Thr Glu Cys Cys Asp Thr Glu ValTyr Lys Ala 90 95 100 ACA GTA AAA GAT GAC CTC ACC AAG TGG CAG AAT GTTCTG AAG GCT CAT 510 Thr Val Lys Asp Asp Leu Thr Lys Trp Gln Asn Val LeuLys Ala His 105 110 115 AGC TCT GTG GAC TGG TTA ATA GTG ATA GTT GAA AATGAT GCC AAG AAA 558 Ser Ser Val Asp Trp Leu Ile Val Ile Val Glu Asn AspAla Lys Lys 120 125 130 AAA AAC AAA ACC AAC ATC CTT CCC CGA ACC TCT ATTGTG GAC AAA ATA 606 Lys Asn Lys Thr Asn Ile Leu Pro Arg Thr Ser Ile ValAsp Lys Ile 135 140 145 150 
AGA AAT GAT TTT TGT AAT AAA CAG AGT GAC AGGTGT GTT GTG CTC TCC 654 Arg Asn Asp Phe Cys Asn Lys Gln Ser Asp Arg CysVal Val Leu Ser 155 160 165 GAC CCC TTG AAG GAC TCT TCT CGA ACT CAG GAATCC TGG AAT GCC TTC 702 Asp Pro Leu Lys Asp Ser Ser Arg Thr Gln Glu SerTrp Asn Ala Phe 170 175 180 CTG ACC AAA CTC AGG ACA TTG CTT CTT ATG TCTTTT ACC AAA AAC CTA 750 Leu Thr Lys Leu Arg Thr Leu Leu Leu Met Ser PheThr Lys Asn Leu 185 190 195 GGC AAG TTT GAG GAT GAC ATG AGA ACC TTG AGGGAG AAG AGG ACT GAG 798 Gly Lys Phe Glu Asp Asp Met Arg Thr Leu Arg GluLys Arg Thr Glu 200 205 210 CCA GGC TGG AGC TTT TGT GAA TAT TTC ATG GTTCAG GAG GAG CTT GCC 846 Pro Gly Trp Ser Phe Cys Glu Tyr Phe Met Val GlnGlu Glu Leu Ala 215 220 225 230 TTT GTT TTC GAG ATG CTG CAG CAG TTC GAGGAC GCC CTG GTG CAG TAC 894 Phe Val Phe Glu Met Leu Gln Gln Phe Glu AspAla Leu Val Gln Tyr 235 240 245 GAC GAA CTG GAC GCC CTC TTC TCT CAG TATGTG GTC AAC TTC GGG GCC 942 Asp Glu Leu Asp Ala Leu Phe Ser Gln Tyr ValVal Asn Phe Gly Ala 250 255 260 GGG GAT GGT GCC AAC TGG CTG ACT TTT TTCTGC CAG CCA GTG AAG AGC 990 Gly Asp Gly Ala Asn Trp Leu Thr Phe Phe CysGln Pro Val Lys Ser 265 270 275 TGG AAC GGA TTG ATC CTC CGA AAA CCC ATAGAT ATG GAG AAG CGG GAA 1038 Trp Asn Gly Leu Ile Leu Arg Lys Pro Ile AspMet Glu Lys Arg Glu 280 285 290 TCG ATC CAG AGG CGA GAA GCC ACC CTG TTAGAT CTG CGC AGT TAC CTG 1086 Ser Ile Gln Arg Arg Glu Ala Thr Leu Leu AspLeu Arg Ser Tyr Leu 295 300 305 310 TTC TCT CGC CAG TGC ACC TTG CTG CTCTTC CTG CAG AGG CCG TGG GAG 1134 Phe Ser Arg Gln Cys Thr Leu Leu Leu PheLeu Gln Arg Pro Trp Glu 315 320 325 GTG GCC CAG CGC GCC CTA GAG CTG CTGCAC AAC TGC GTG CAG GAA CTG 1182 Val Ala Gln Arg Ala Leu Glu Leu Leu HisAsn Cys Val Gln Glu Leu 330 335 340 AAG CTC TTA GAA GTC TCT GTC CCA CCTGGT GCT CTG GAC TGC TGG GTG 1230 Lys Leu Leu Glu Val Ser Val Pro Pro GlyAla Leu Asp Cys Trp Val 345 350 355 TTT CTG AGC TGT CTG GAG GTG TTG CAGAGG ATA GAA GGC TGC TGT GAC 1278 Phe Leu Ser Cys Leu Glu Val Leu Gln ArgIle Glu Gly Cys Cys Asp 360 365 
370 CGG GCA CAG ATC GAC TCA AAC ATT GCCCAC ACT GTG GGG CTA TGG AGC 1326 Arg Ala Gln Ile Asp Ser Asn Ile Ala HisThr Val Gly Leu Trp Ser 375 380 385 390 TAT GCC ACA GAA AAG TTA AAG TCCTTG GGC TAT CTA TGT GGA CTT GTG 1374 Tyr Ala Thr Glu Lys Leu Lys Ser LeuGly Tyr Leu Cys Gly Leu Val 395 400 405 TCA GAG AAA GGA CCT AAC TCA GAAGAT CTC AAC AGG ACA GTT GAC CTT 1422 Ser Glu Lys Gly Pro Asn Ser Glu AspLeu Asn Arg Thr Val Asp Leu 410 415 420 TTG GCA GGT TTG GGA GCT GAG CGACCA GAA ACA GCC AAC ACA GCT CAG 1470 Leu Ala Gly Leu Gly Ala Glu Arg ProGlu Thr Ala Asn Thr Ala Gln 425 430 435 AGT CCT TAT AAG AAA CTG AAA GAAGCA TTA TCG TCA GTG GAA GCT TTT 1518 Ser Pro Tyr Lys Lys Leu Lys Glu AlaLeu Ser Ser Val Glu Ala Phe 440 445 450 GAA AAA CAC TAC TTA GAT TTG TCCCAT GCC ACC ATT GAA ATG TAT ACA 1566 Glu Lys His Tyr Leu Asp Leu Ser HisAla Thr Ile Glu Met Tyr Thr 455 460 465 470 AGC ATT GGG AGG ATT CGA TCTGCT AAG TTT GTT GGA AAA GAT CTG GCA 1614 Ser Ile Gly Arg Ile Arg Ser AlaLys Phe Val Gly Lys Asp Leu Ala 475 480 485 GAG TTT TAC ATG AGG AAA AAGGCT CCA CAA AAG GCA GAA ATC TAT CTT 1662 Glu Phe Tyr Met Arg Lys Lys AlaPro Gln Lys Ala Glu Ile Tyr Leu 490 495 500 CAA GGA GCA CTG AAA AAC TACCTG GCT GAG GGC TGG GCA CTC CCC ATC 1710 Gln Gly Ala Leu Lys Asn Tyr LeuAla Glu Gly Trp Ala Leu Pro Ile 505 510 515 ACA CAC ACA AGG AAG CAG CTGGCC GAA TGT CAA AAG CAC CTT GGA CAA 1758 Thr His Thr Arg Lys Gln Leu AlaGlu Cys Gln Lys His Leu Gly Gln 520 525 530 ATT GAA AAC TAC CTG CAG ACCAGC AGC CTC TTA GCC AGT GAC CAC CAC 1806 Ile Glu Asn Tyr Leu Gln Thr SerSer Leu Leu Ala Ser Asp His His 535 540 545 550 CTC ACT GAA GAG GAG CGCAAG CAC TTC TGC CAG GAG ATA CTT GAC TTT 1854 Leu Thr Glu Glu Glu Arg LysHis Phe Cys Gln Glu Ile Leu Asp Phe 555 560 565 GCC AGC CAG CCG TCA GACAGC CCA GGT CAT AAG ATA GTG CTA CCC ATG 1902 Ala Ser Gln Pro Ser Asp SerPro Gly His Lys Ile Val Leu Pro Met 570 575 580 CAT TCC TTT GCA CAA CTGCGA GAT CTC CAT TTT GAT CCC TCC AAT GCC 1950 His Ser Phe Ala Gln Leu ArgAsp Leu His Phe Asp Pro Ser 
Asn Ala 585 590 595 GTG GTC CAC GTG GGC GGCGTT TTG TGC GTT GAG ATA ACC ATG TAC AGC 1998 Val Val His Val Gly Gly ValLeu Cys Val Glu Ile Thr Met Tyr Ser 600 605 610 CAG ATG CCT GTG CCT GTTCAC GTG GAG CAG ATT GTG GTC AAT GTC CAC 2046 Gln Met Pro Val Pro Val HisVal Glu Gln Ile Val Val Asn Val His 615 620 625 630 TTC AGC ATT GAG AAAAAC AGC TAC CGG AAG ACT GCG GAG TGG CTT ACC 2094 Phe Ser Ile Glu Lys AsnSer Tyr Arg Lys Thr Ala Glu Trp Leu Thr 635 640 645 AAG CAC AAG ACG TCCAAT GGG ATC ATT AAC TTT CCA CCC GAG ACC GCA 2142 Lys His Lys Thr Ser AsnGly Ile Ile Asn Phe Pro Pro Glu Thr Ala 650 655 660 CCT TTC CCT GTA TCCCAA AAC AGT TTG CCC GCG CTG GAG TTG TAT GAA 2190 Pro Phe Pro Val Ser GlnAsn Ser Leu Pro Ala Leu Glu Leu Tyr Glu 665 670 675 ATG TTT GAG AGA AGCCCA TCT GAT AAC TCC TTG AAC ACG ACT GGG ATT 2238 Met Phe Glu Arg Ser ProSer Asp Asn Ser Leu Asn Thr Thr Gly Ile 680 685 690 ATC TGC AGA AAC GTCCAC ATG CTC CTG AGA AGG CAG GAG AGC AGC TCC 2286 Ile Cys Arg Asn Val HisMet Leu Leu Arg Arg Gln Glu Ser Ser Ser 695 700 705 710 TCT CTA GAG ATGCCC TCA GGG GTG GCT CTG GAG GAG GGT GCC CAC GTG 2334 Ser Leu Glu Met ProSer Gly Val Ala Leu Glu Glu Gly Ala His Val 715 720 725 CTG AGG TGC AGCCAC GTG ACC CTG GAA CCA GGG GCC AAC CAG ATA ACA 2382 Leu Arg Cys Ser HisVal Thr Leu Glu Pro Gly Ala Asn Gln Ile Thr 730 735 740 TTC AGG ACT CAGGCC AAG GAA CCT GGA ACG TAT ACA CTC AGG CAG CTG 2430 Phe Arg Thr Gln AlaLys Glu Pro Gly Thr Tyr Thr Leu Arg Gln Leu 745 750 755 TGC GCC TCG GTGGGC TCC GTG TGG TTC GTC CTC CCT CAC ATC TAC CCC 2478 Cys Ala Ser Val GlySer Val Trp Phe Val Leu Pro His Ile Tyr Pro 760 765 770 ATT GTG CAG TACGAC GTG TAC TCA CAG GAG CCC CAG CTG CAC GTG GAG 2526 Ile Val Gln Tyr AspVal Tyr Ser Gln Glu Pro Gln Leu His Val Glu 775 780 785 790 CCG CTG GCTGAT AGC CTT CTG GCA GGC ATT CCT CAG AGA GTC AAG TTC 2574 Pro Leu Ala AspSer Leu Leu Ala Gly Ile Pro Gln Arg Val Lys Phe 795 800 805 ACT GTC ACTACC GGC CAT GAT ACG ATA AAG AAT GGA GAC AGC CTG CAG 2622 Thr Val Thr ThrGly His Asp Thr Ile Lys 
Asn Gly Asp Ser Leu Gln 810 815 820 CTT AGC AATGCC GAA GCC ATG CTC ATC CTG TGC CAG GCG GAG AGC AGG 2670 Leu Ser Asn AlaGlu Ala Met Leu Ile Leu Cys Gln Ala Glu Ser Arg 825 830 835 GCT GTG GTCTAC TCC AAC ACG AGA GAA CAG TCT TCT GAG GCC GCG CTC 2718 Ala Val Val TyrSer Asn Thr Arg Glu Gln Ser Ser Glu Ala Ala Leu 840 845 850 CGG ATT CAGTCC TCC GAC AAG GTC ACG AGC ATC AGT CTG CCT GTT GCG 2766 Arg Ile Gln SerSer Asp Lys Val Thr Ser Ile Ser Leu Pro Val Ala 855 860 865 870 CCT GCGTAC CAC GTG ATC GAA TTT GAA CTG GAA GTT CTC TCT TTA CCT 2814 Pro Ala TyrHis Val Ile Glu Phe Glu Leu Glu Val Leu Ser Leu Pro 875 880 885 TCA GCCCCA GCA CTC GGA GGG GAG AGT GAC ATG CTG GGG ATG GCA GAG 2862 Ser Ala ProAla Leu Gly Gly Glu Ser Asp Met Leu Gly Met Ala Glu 890 895 900 CCC CACAGG AAG CAT AAG GAC AAA CAG AGA ACT GGC CGC TGC ATG GTT 2910 Pro His ArgLys His Lys Asp Lys Gln Arg Thr Gly Arg Cys Met Val 905 910 915 ACC ACAGAC CAC AAA GTG TCG ATT GAC TGC CCG TGG TCC ATC TAC TCC 2958 Thr Thr AspHis Lys Val Ser Ile Asp Cys Pro Trp Ser Ile Tyr Ser 920 925 930 ACA GTCATC GCA CTG ACC TTC AGC GTA CCC TTC AGG ACC ACA CAC AGC 3006 Thr Val IleAla Leu Thr Phe Ser Val Pro Phe Arg Thr Thr His Ser 935 940 945 950 CTCCTG TCC TCA GGA ACA CGG AAA TAT GTT CAA GTT TGT GTC CAG AAT 3054 Leu LeuSer Ser Gly Thr Arg Lys Tyr Val Gln Val Cys Val Gln Asn 955 960 965 TTGTCA GAA CTT GAC TTT CAG CTG TCA GAT AGT TAT CTT GTA GAT ACC 3102 Leu SerGlu Leu Asp Phe Gln Leu Ser Asp Ser Tyr Leu Val Asp Thr 970 975 980 GGTGAT AGT ACC GAC CTG CAA CTA GTA CCA CTG AAC ACG CAG TCC CAG 3150 Gly AspSer Thr Asp Leu Gln Leu Val Pro Leu Asn Thr Gln Ser Gln 985 990 995 CAGCCC ATC TAC AGC AAG CAG TCG GTG TTC TTC GTC TGG GAA CTC AAG 3198 Gln ProIle Tyr Ser Lys Gln Ser Val Phe Phe Val Trp Glu Leu Lys 1000 1005 1010TGG ACA GAA GAG CCT CCC CCT TCT CTG CAT TGC CGG TTC TCT GTT GGA 3246 TrpThr Glu Glu Pro Pro Pro Ser Leu His Cys Arg Phe Ser Val Gly 1015 10201025 1030 TTT TCC CCA GCT TCT GAG GAA CAG CTG TCT ATC TCC TTA AAG CCGTAT 3294 Phe Ser Pro Ala 
Ser Glu Glu Gln Leu Ser Ile Ser Leu Lys Pro Tyr1035 1040 1045 ACT TAT GAA TTT AAA GTG GAA AAT TTT TTT ACA TTA TAC AACGTG AAG 3342 Thr Tyr Glu Phe Lys Val Glu Asn Phe Phe Thr Leu Tyr Asn ValLys 1050 1055 1060 GCT GAG ATC TTT CCC CCT TCG GGA ATG GAG TAT TGC AGAACA GGC TCC 3390 Ala Glu Ile Phe Pro Pro Ser Gly Met Glu Tyr Cys Arg ThrGly Ser 1065 1070 1075 CTC TGC TCC CTG GAG GTT TTG ATC ACG AGG CTC TCAGAC CTC TTG GAG 3438 Leu Cys Ser Leu Glu Val Leu Ile Thr Arg Leu Ser AspLeu Leu Glu 1080 1085 1090 GTG GAT AAA GAT GAA GCA CTG ACT GAA TCT GATGAG CAT TTT TCG ACA 3486 Val Asp Lys Asp Glu Ala Leu Thr Glu Ser Asp GluHis Phe Ser Thr 1095 1100 1105 1110 AAG CTT ATG TAT GAA GTT GTC GAC AACAGT AGC AAC TGG GCA GTG TGT 3534 Lys Leu Met Tyr Glu Val Val Asp Asn SerSer Asn Trp Ala Val Cys 1115 1120 1125 GGG AAA AGC TGC GGT GTC ATC TCCATG CCA GTG GCT GCT CGG GCC ACT 3582 Gly Lys Ser Cys Gly Val Ile Ser MetPro Val Ala Ala Arg Ala Thr 1130 1135 1140 CAC AGG GTC CAC ATG GAA GTGATG CCG CTC TTC GCC GGG TAT CTC CCC 3630 His Arg Val His Met Glu Val MetPro Leu Phe Ala Gly Tyr Leu Pro 1145 1150 1155 CTG CCC GAC GTC AGG CTGTTC AAG TAC CTC CCC CAT CAT TCT GCA CAC 3678 Leu Pro Asp Val Arg Leu PheLys Tyr Leu Pro His His Ser Ala His 1160 1165 1170 TCC TCC CAA CTG GACGCT GAC AGC TGG ATA GAA AAC GCA GCC TGT CAG 3726 Ser Ser Gln Leu Asp AlaAsp Ser Trp Ile Glu Asn Ala Ala Cys Gln 1175 1180 1185 1190 TAGACAAGCACGGGGACGAC CAGCCGGACA GCAGCAGCCT CAAGAGCAGG GGCAGCGTGC 3786 ATTCGGCCTGCAGCAGCGAG CACAAAGGCC TACCCATGCC CCGGCTGCAG GCACTGCCGG 3846 CCGGCCAGGTCTTCAACTCC AGCTCGGGCA CACAAGTCCT GGTCATCCCC AGCCAAGATG 3906 ACCACGTCCTGGAAGTCAGT GTAACATGAC AACGCCAGGG TGAACACACG CCACTTCCCA 3966 GCTAGGAGTGCACTTTATGG GACTGTGACT GGACTCTTCC GTTCTGGCTC CAGCCAGACC 4026 TTCAGTGGTCCTGCCTGGCC GTGGGGACAT CAGAGAGTGT CATCACGCAG CTGGCCAGCT 4086 GAGTTCTGTTGTTGTTTTCA TGCCGCCTGT GATCTCAGAT TCCTGCTTTT CTCACCCCGT 4146 CCCCATGCTGGTGTCCGACG CCGCTTACTC AGAGCCCTGG CCTCCCTCCC CCTACCTCAC 4206 ACGCTGCTCATGAAAGTTTC CACCCACGCT GTCTCCACGG AACAGCCTCC 
GTCTGCTGGC 4266 TCTTCGTGGAAGGCCATTTG TCTTTCAGGT AGACACTCAG CAGCCCTCAC GGTCTTAGTG 4326 ACGTGTGTGCCTTTCTGGTC ACACAGCTGC CCAGTTTCCT GATCGGGGTG GATTTGTGTC 4386 CCCTAAGGGGTAAAACAGCC GTTTACCGCA GATCCTCTCA TACACCCTTC TAGGGGAGGC 4446 GGGTGGGGGAGGGAGGGATC ATAACCCCTT CTGTGCCTTG GGATGCCGGA GCTGGGGGAC 4506 CTGGAGGCCCATCAGCCGGA GCCACGTGAA AGGTACTGAA GAAAGCTGAG ACCCGGCTGT 4566 GAGGAGCGCCTCAGCGGTGA GGTGGTTTAG GGATAAATGT TTCTGGAACC CTGTGGTCCC 4626 CCATAATGTTGATAGATATC ATATGCACTG GGAGTTAAAT ATATTTAATT TAATGATCAT 4686 TATATATGTGGGGGTTAATA TGTTGTTTTT CTGTCCCTTT AAAGTCTTTA CATGTAATTG 4746 TAGCTGTATAATCGTTATTT TTCTTTTGCA TCTTAAGTCT TAGAAATTAA GATATTCCAT 4806 CGTGAGGATGAGAGAGGTCC TCAGTGTGTT TTTGGTCTGG TTGTAGGGAA GGACTCAAGT 4866 CCTGGAATGTCCTCCACTGG TCTACTGAGT TGCAGTCACA CTGTTCCAAT GGATTATTTG 4926 CTTTCGGTTGTAAATTTAAT TGTACATATG GTTGATTTAT TATTTTTAAA AATACAGACT 4986 AACTGATGTAATGTTTATGT ATAAGTTGCA CCAAAAATCA AGGACAAAAA TAAGTGTGTT 5046 TGTTTTTACAGGTGTGAAAG TCACAGCTTG TAAATAAGTG TTGTATGTAT TAAACCTTTT 5106 CCAGTTCTCCAAAAAAAAAA AAAAAAAAAA AAAAA 5141 (2) INFORMATION FOR SEQ ID NO: 2: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 1190 amino acids (B) TYPE: aminoacid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCEDESCRIPTION: SEQ ID NO: 2: Met Asp Ala Ser Glu Glu Pro Leu Pro Pro ValIle Tyr Thr Met Glu 1 5 10 15 Asn Lys Pro Ile Val Thr Cys Ala Gly AspGln Asn Leu Phe Thr Ser 20 25 30 Val Tyr Pro Thr Leu Ser Gln Gln Leu ProArg Glu Pro Met Glu Trp 35 40 45 Arg Arg Ser Tyr Gly Arg Ala Pro Lys MetIle His Leu Glu Ser Asn 50 55 60 Phe Val Gln Phe Lys Glu Glu Leu Leu ProLys Glu Gly Asn Lys Ala 65 70 75 80 Leu Leu Thr Phe Pro Phe Leu His IleTyr Trp Thr Glu Cys Cys Asp 85 90 95 Thr Glu Val Tyr Lys Ala Thr Val LysAsp Asp Leu Thr Lys Trp Gln 100 105 110 Asn Val Leu Lys Ala His Ser SerVal Asp Trp Leu Ile Val Ile Val 115 120 125 Glu Asn Asp Ala Lys Lys LysAsn Lys Thr Asn Ile Leu Pro Arg Thr 130 135 140 Ser Ile Val Asp Lys IleArg Asn Asp Phe Cys Asn Lys Gln Ser Asp 145 150 155 160 Arg Cys Val ValLeu 
Ser Asp Pro Leu Lys Asp Ser Ser Arg Thr Gln 165 170 175 Glu Ser TrpAsn Ala Phe Leu Thr Lys Leu Arg Thr Leu Leu Leu Met 180 185 190 Ser PheThr Lys Asn Leu Gly Lys Phe Glu Asp Asp Met Arg Thr Leu 195 200 205 ArgGlu Lys Arg Thr Glu Pro Gly Trp Ser Phe Cys Glu Tyr Phe Met 210 215 220Val Gln Glu Glu Leu Ala Phe Val Phe Glu Met Leu Gln Gln Phe Glu 225 230235 240 Asp Ala Leu Val Gln Tyr Asp Glu Leu Asp Ala Leu Phe Ser Gln Tyr245 250 255 Val Val Asn Phe Gly Ala Gly Asp Gly Ala Asn Trp Leu Thr PhePhe 260 265 270 Cys Gln Pro Val Lys Ser Trp Asn Gly Leu Ile Leu Arg LysPro Ile 275 280 285 Asp Met Glu Lys Arg Glu Ser Ile Gln Arg Arg Glu AlaThr Leu Leu 290 295 300 Asp Leu Arg Ser Tyr Leu Phe Ser Arg Gln Cys ThrLeu Leu Leu Phe 305 310 315 320 Leu Gln Arg Pro Trp Glu Val Ala Gln ArgAla Leu Glu Leu Leu His 325 330 335 Asn Cys Val Gln Glu Leu Lys Leu LeuGlu Val Ser Val Pro Pro Gly 340 345 350 Ala Leu Asp Cys Trp Val Phe LeuSer Cys Leu Glu Val Leu Gln Arg 355 360 365 Ile Glu Gly Cys Cys Asp ArgAla Gln Ile Asp Ser Asn Ile Ala His 370 375 380 Thr Val Gly Leu Trp SerTyr Ala Thr Glu Lys Leu Lys Ser Leu Gly 385 390 395 400 Tyr Leu Cys GlyLeu Val Ser Glu Lys Gly Pro Asn Ser Glu Asp Leu 405 410 415 Asn Arg ThrVal Asp Leu Leu Ala Gly Leu Gly Ala Glu Arg Pro Glu 420 425 430 Thr AlaAsn Thr Ala Gln Ser Pro Tyr Lys Lys Leu Lys Glu Ala Leu 435 440 445 SerSer Val Glu Ala Phe Glu Lys His Tyr Leu Asp Leu Ser His Ala 450 455 460Thr Ile Glu Met Tyr Thr Ser Ile Gly Arg Ile Arg Ser Ala Lys Phe 465 470475 480 Val Gly Lys Asp Leu Ala Glu Phe Tyr Met Arg Lys Lys Ala Pro Gln485 490 495 Lys Ala Glu Ile Tyr Leu Gln Gly Ala Leu Lys Asn Tyr Leu AlaGlu 500 505 510 Gly Trp Ala Leu Pro Ile Thr His Thr Arg Lys Gln Leu AlaGlu Cys 515 520 525 Gln Lys His Leu Gly Gln Ile Glu Asn Tyr Leu Gln ThrSer Ser Leu 530 535 540 Leu Ala Ser Asp His His Leu Thr Glu Glu Glu ArgLys His Phe Cys 545 550 555 560 Gln Glu Ile Leu Asp Phe Ala Ser Gln ProSer Asp Ser Pro Gly His 565 570 575 Lys Ile Val Leu Pro Met His Ser PheAla Gln Leu Arg 
Asp Leu His 580 585 590 Phe Asp Pro Ser Asn Ala Val ValHis Val Gly Gly Val Leu Cys Val 595 600 605 Glu Ile Thr Met Tyr Ser GlnMet Pro Val Pro Val His Val Glu Gln 610 615 620 Ile Val Val Asn Val HisPhe Ser Ile Glu Lys Asn Ser Tyr Arg Lys 625 630 635 640 Thr Ala Glu TrpLeu Thr Lys His Lys Thr Ser Asn Gly Ile Ile Asn 645 650 655 Phe Pro ProGlu Thr Ala Pro Phe Pro Val Ser Gln Asn Ser Leu Pro 660 665 670 Ala LeuGlu Leu Tyr Glu Met Phe Glu Arg Ser Pro Ser Asp Asn Ser 675 680 685 LeuAsn Thr Thr Gly Ile Ile Cys Arg Asn Val His Met Leu Leu Arg 690 695 700Arg Gln Glu Ser Ser Ser Ser Leu Glu Met Pro Ser Gly Val Ala Leu 705 710715 720 Glu Glu Gly Ala His Val Leu Arg Cys Ser His Val Thr Leu Glu Pro725 730 735 Gly Ala Asn Gln Ile Thr Phe Arg Thr Gln Ala Lys Glu Pro GlyThr 740 745 750 Tyr Thr Leu Arg Gln Leu Cys Ala Ser Val Gly Ser Val TrpPhe Val 755 760 765 Leu Pro His Ile Tyr Pro Ile Val Gln Tyr Asp Val TyrSer Gln Glu 770 775 780 Pro Gln Leu His Val Glu Pro Leu Ala Asp Ser LeuLeu Ala Gly Ile 785 790 795 800 Pro Gln Arg Val Lys Phe Thr Val Thr ThrGly His Asp Thr Ile Lys 805 810 815 Asn Gly Asp Ser Leu Gln Leu Ser AsnAla Glu Ala Met Leu Ile Leu 820 825 830 Cys Gln Ala Glu Ser Arg Ala ValVal Tyr Ser Asn Thr Arg Glu Gln 835 840 845 Ser Ser Glu Ala Ala Leu ArgIle Gln Ser Ser Asp Lys Val Thr Ser 850 855 860 Ile Ser Leu Pro Val AlaPro Ala Tyr His Val Ile Glu Phe Glu Leu 865 870 875 880 Glu Val Leu SerLeu Pro Ser Ala Pro Ala Leu Gly Gly Glu Ser Asp 885 890 895 Met Leu GlyMet Ala Glu Pro His Arg Lys His Lys Asp Lys Gln Arg 900 905 910 Thr GlyArg Cys Met Val Thr Thr Asp His Lys Val Ser Ile Asp Cys 915 920 925 ProTrp Ser Ile Tyr Ser Thr Val Ile Ala Leu Thr Phe Ser Val Pro 930 935 940Phe Arg Thr Thr His Ser Leu Leu Ser Ser Gly Thr Arg Lys Tyr Val 945 950955 960 Gln Val Cys Val Gln Asn Leu Ser Glu Leu Asp Phe Gln Leu Ser Asp965 970 975 Ser Tyr Leu Val Asp Thr Gly Asp Ser Thr Asp Leu Gln Leu ValPro 980 985 990 Leu Asn Thr Gln Ser Gln Gln Pro Ile Tyr Ser Lys Gln SerVal Phe 995 1000 1005 Phe 
Val Trp Glu Leu Lys Trp Thr Glu Glu Pro ProPro Ser Leu His 1010 1015 1020 Cys Arg Phe Ser Val Gly Phe Ser Pro AlaSer Glu Glu Gln Leu Ser 1025 1030 1035 1040 Ile Ser Leu Lys Pro Tyr ThrTyr Glu Phe Lys Val Glu Asn Phe Phe 1045 1050 1055 Thr Leu Tyr Asn ValLys Ala Glu Ile Phe Pro Pro Ser Gly Met Glu 1060 1065 1070 Tyr Cys ArgThr Gly Ser Leu Cys Ser Leu Glu Val Leu Ile Thr Arg 1075 1080 1085 LeuSer Asp Leu Leu Glu Val Asp Lys Asp Glu Ala Leu Thr Glu Ser 1090 10951100 Asp Glu His Phe Ser Thr Lys Leu Met Tyr Glu Val Val Asp Asn Ser1105 1110 1115 1120 Ser Asn Trp Ala Val Cys Gly Lys Ser Cys Gly Val IleSer Met Pro 1125 1130 1135 Val Ala Ala Arg Ala Thr His Arg Val His MetGlu Val Met Pro Leu 1140 1145 1150 Phe Ala Gly Tyr Leu Pro Leu Pro AspVal Arg Leu Phe Lys Tyr Leu 1155 1160 1165 Pro His His Ser Ala His SerSer Gln Leu Asp Ala Asp Ser Trp Ile 1170 1175 1180 Glu Asn Ala Ala CysGln 1185 1190 (2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA(genomic) (vi) ORIGINAL SOURCE: (A) ORGANISM: Moraxella bovis (C)INDIVIDUAL ISOLATE: MboI linker (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:CCTGATGCTC GAGTGAATTC 20
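The coordinates in the sequence listing can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original disclosure. It assumes the standard genetic code and takes the coding region as nucleotides 157-3726 of SEQ ID NO: 1, the range recited in the claims. The names `translate`, `codon_start`, `ROW` and `n_codons` are hypothetical helpers introduced only for illustration. `ROW` is the 16-codon row of the listing that ends at nucleotide 1326 and is annotated with residues 375 to 390.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the listing uses the standard code; nothing here is taken from a library).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Coding region of SEQ ID NO: 1 as recited in the claims (1-based, inclusive).
CDS_START, CDS_END = 157, 3726


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def codon_start(residue: int) -> int:
    """1-based nucleotide position of the first base of a residue's codon."""
    return CDS_START + 3 * (residue - 1)


# One row of the listing: 16 codons ending at nucleotide 1326.
ROW = "".join(
    "CGG GCA CAG ATC GAC TCA AAC ATT GCC CAC ACT GTG GGG CTA TGG AGC".split()
)

# Number of codons in the coding region (stop codon excluded).
n_codons = (CDS_END - CDS_START + 1) // 3

print(n_codons)              # length of the encoded protein in residues
print(translate(ROW))        # one-letter translation of the row
print(codon_start(375))      # first nucleotide of the codon for residue 375
```

Under these assumptions the coding region spans 3570 nucleotides, that is 1190 codons, which agrees with the 1190 amino acid length stated for SEQ ID NO: 2. The row translates to Arg Ala Gln Ile Asp Ser Asn Ile Ala His Thr Val Gly Leu Trp Ser, matching the residues printed beneath it.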

What is claimed is:
 1. Isolated nucleic acid encoding a human EHOC-1 polypeptide.
 2. Isolated nucleic acid according to claim 1, wherein said nucleic acid comprises DNA.
 3. DNA according to claim 2, wherein said DNA is a cDNA.
 4. DNA according to claim 2, wherein said DNA encodes the amino acid sequence set forth in SEQ ID NO: 2.
 5. DNA according to claim 2, wherein said DNA hybridizes under high stringency conditions to substantially the entire coding sequence (nucleotides 157-3726) set forth in SEQ ID NO: 2.
 6. DNA according to claim 2, wherein said DNA has substantially the same nucleotide sequence as the nucleotide sequence set forth in SEQ ID NO: 1.
 7. A vector comprising DNA according to claim 2.
 8. A host cell containing a vector according to claim 7, wherein said cell is a procaryotic cell or a eucaryotic cell.
 9. A host cell according to claim 8, wherein said cell expresses a functional EHOC-1 protein.
 10. A nucleic acid probe comprising at least 15 nucleotides capable of specifically hybridizing with a sequence of nucleic acids of the nucleotide sequence set forth in SEQ ID NO: 1.
 11. A nucleic acid probe according to claim 10, wherein said probe is labeled with a detectable marker.
 12. A kit for detecting mutations and aneuploidies in chromosome 21 at locus q22.3 comprising a plurality of probes, wherein each probe comprises a nucleic acid sequence having at least 15 bp of contiguous nucleotides capable of specifically hybridizing with a sequence of nucleic acids of the nucleotide sequence set forth in SEQ ID NO: 1, and wherein each individual probe corresponds to a specific locus on chromosome 21q22.3.
 13. Isolated mRNA complementary to DNA according to claim 2.
 14. An oligonucleotide composition comprising chemical analogues of the nucleic acid of claim 2 operatively linked to a promoter of RNA transcription.
 15. An antisense oligonucleotide capable of specifically binding to and modulating the translation of mRNA according to claim 13.
 16. Isolated EHOC-1 polypeptide and functional equivalents thereof.
 17. Isolated EHOC-1 polypeptide according to claim 16, wherein said polypeptide has substantially the same amino acid sequence as that set forth in SEQ ID NO: 2.
 18. Isolated EHOC-1 polypeptide according to claim 16, wherein said polypeptide has the same amino acid sequence as that set forth in SEQ ID NO: 2.
 19. Isolated EHOC-1 polypeptide according to claim 16, wherein said polypeptide is encoded by a nucleotide sequence that is substantially the same nucleotide sequence as that set forth in SEQ ID NO: 1.
 20. Isolated EHOC-1 polypeptide according to claim 16, wherein said polypeptide is encoded by the nucleotide sequence set forth in SEQ ID NO: 1.
 21. An EHOC-1 polypeptide expressed recombinantly in a host cell.
 22. An EHOC-1 polypeptide according to claim 21, wherein said polypeptide is encoded by a nucleotide sequence that is substantially the same as the nucleotide sequence set forth in SEQ ID NO: 1.
 23. An EHOC-1 polypeptide according to claim 21, wherein said polypeptide is encoded by the nucleotide sequence set forth in SEQ ID NO: 1.
 24. An antibody that specifically binds to a determinant on a human EHOC-1 protein or active fragment thereof.
 25. An antibody according to claim 24, wherein said antibody is a monoclonal antibody.
 26. An antibody according to claim 24, wherein said antibody is a polyclonal antibody.
 27. A composition comprising an amount of the antisense oligonucleotide according to claim 13 effective to modulate expression of a human EHOC-1 polypeptide and an acceptable hydrophobic carrier capable of passing through a cell membrane.
 28. A composition according to claim 27, wherein the oligonucleotide is coupled to a substance which inactivates mRNA.
 29. A composition according to claim 28, wherein said substance is a ribozyme.
 30. A composition comprising an amount of an antibody according to claim 24 effective to block binding of naturally occurring ligands to the human EHOC-1 receptor and an acceptable carrier.
 31. A transgenic nonhuman mammal expressing DNA encoding a human EHOC-1 polypeptide.
 32. A transgenic nonhuman mammal according to claim 31, wherein said DNA encoding said polypeptide has been mutated so as to be incapable of normal polypeptide activity, and wherein the polypeptide so expressed is not native EHOC-1 polypeptide.
 33. A transgenic nonhuman mammal, the genome of which comprises antisense DNA complementary to DNA encoding a human EHOC-1 polypeptide, wherein said antisense DNA is transcribed into antisense mRNA complementary to mRNA encoding a human EHOC-1 polypeptide.
 34. A transgenic nonhuman mammal according to claim 31, wherein said DNA is operatively linked to an inducible promoter.
 35. A transgenic nonhuman mammal according to claim 31, wherein said DNA is operatively linked to tissue specific regulatory elements.
 36. A transgenic nonhuman mammal according to claim 31, wherein the transgenic nonhuman mammal is a mouse.
 37. A method for identifying nucleic acids encoding a human EHOC-1 protein, said method comprising: contacting a sample containing nucleic acids with a probe according to claim 11, wherein said contacting is effected under high stringency hybridization conditions, and identifying compounds which hybridize thereto.
 38. A method for identifying compound(s) which bind to a human EHOC-1 polypeptide, said method comprising contacting cells according to claim 9 with said compound(s) and identifying compounds which bind thereto.
 39. A method for detecting the presence of a human EHOC-1 polypeptide on a cell surface, said method comprising contacting a test cell with an antibody according to claim 24, detecting the presence of an antibody-receptor complex, and thereby detecting the presence of a human EHOC-1 polypeptide on the cell surface.
 40. A method for diagnosing a predisposition to a disorder associated with the expression of a specific human EHOC-1 polypeptide allele, said method comprising: contacting a sample containing nucleic acids with a plurality of probes, wherein each probe comprises a nucleic acid sequence having at least 15 bp of contiguous nucleotides capable of specifically hybridizing with a sequence of nucleic acids of the nucleotide sequence set forth in SEQ ID NO: 1, and wherein each individual probe corresponds to a specific locus on chromosome 21q22.3.
 41. A method according to claim 40, wherein said disorder is selected from progressive myoclonus epilepsy, holoprosencephaly, or autoimmune polyglandular disease.
 42. A method for deterring the onset of symptoms associated with a particular disorder comprising administering a composition which modulates expression of gene.
 43. A method for introducing changes at human chromosome locus 21q22.3 comprising transforming a sample of cells obtained from a subject having progressive myoclonus epilepsy with the nucleic acid according to claim 1 along with a selective marker gene; maintaining cells in selective media; and isolating viable cells containing a modified target sequence.
 44. A method of supplying wild-type EHOC-1 gene function to a cell which has a mutation/aneuploidy in the EHOC-1 gene comprising introducing a wild-type EHOC-1 gene or functional fragment thereof into said cell such that it is expressed.
 45. Single strand DNA primers for amplification diagnosis of progressive myoclonus epilepsy, wherein said primers comprise a nucleic acid sequence derived from the nucleic acid sequence set forth as SEQ ID NO: 1.
 46. A method for detecting one or more EHOC-1 alleles in a sample of nucleic acid comprising determining the presence or absence of variant nucleotide sequence in a gene contained in any of BAC clones.
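
The probe claims above recite sequences of at least 15 contiguous nucleotides capable of hybridizing with SEQ ID NO: 1. As a purely illustrative sketch, not a method disclosed in the specification, candidate probes of that minimum length can be enumerated as the reverse complements of 15-nucleotide windows of a target strand. The function names `reverse_complement` and `candidate_probes` are hypothetical. The target used here is only the 48-nucleotide fragment of SEQ ID NO: 1 ending at nucleotide 1326 in the listing. The sketch ignores specificity, melting temperature and stringency, all of which a real probe design would need to address.

```python
# Watson-Crick complement for each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Minimum probe length recited in the claims.
MIN_PROBE_LEN = 15

# Fragment of SEQ ID NO: 1 (the row of the listing ending at nucleotide 1326).
FRAGMENT = "".join(
    "CGG GCA CAG ATC GAC TCA AAC ATT GCC CAC ACT GTG GGG CTA TGG AGC".split()
)


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def candidate_probes(target: str, length: int = MIN_PROBE_LEN) -> list[str]:
    """Reverse complements of every contiguous window of the given length."""
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


probes = candidate_probes(FRAGMENT)
print(len(probes))   # number of 15-nucleotide windows in the fragment
print(probes[0])     # probe complementary to the first window
```

Each entry of `probes` is complementary to exactly one 15-nucleotide window of the fragment, which is the sense in which a probe "corresponds to" a stretch of the target in this sketch.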